NOVEL NUCLEIC ACIDS AND POLYPEPTIDES

1. TECHNICAL FIELD 

The present invention provides novel polynucleotides and proteins encoded by such
polynucleotides, along with uses for these polynucleotides and proteins, for example in
therapeutic, diagnostic and research methods.

2. BACKGROUND 

Technology aimed at the discovery of protein factors (including e.g., cytokines, such as
lymphokines, interferons, CSFs, chemokines, and interleukins) has matured rapidly over the past
decade. The now routine hybridization cloning and expression cloning techniques clone novel
polynucleotides "directly" in the sense that they rely on information directly related to the
discovered protein (i.e., partial DNA/amino acid sequence of the protein in the case of
hybridization cloning; activity of the protein in the case of expression cloning). More recent
"indirect" cloning techniques such as signal sequence cloning, which isolates DNA sequences
based on the presence of a now well-recognized secretory leader sequence motif, as well as
various PCR-based or low stringency hybridization-based cloning techniques, have advanced the
state of the art by making available large numbers of DNA/amino acid sequences for proteins
that are known to have biological activity, for example, by virtue of their secreted nature in the
case of leader sequence cloning, by virtue of their cell or tissue source in the case of PCR-based
techniques, or by virtue of structural similarity to other genes of known biological activity.

Identified polynucleotide and polypeptide sequences have numerous applications in, for
example, diagnostics, forensics, gene mapping; identification of mutations responsible for
genetic disorders or other traits, to assess biodiversity, and to produce many other types of data
and products dependent on DNA and amino acid sequences.

3. SUMMARY OF THE INVENTION 

The compositions of the present invention include novel isolated polypeptides, novel
isolated polynucleotides encoding such polypeptides, including recombinant DNA molecules,
cloned genes or degenerate variants thereof, especially naturally occurring variants such as allelic
variants, antisense polynucleotide molecules, and antibodies that specifically recognize one or more
epitopes present on such polypeptides, as well as hybridomas producing such antibodies.

The compositions of the present invention additionally include vectors, including expression
vectors, containing the polynucleotides of the invention, cells genetically engineered to contain such
polynucleotides and cells genetically engineered to express such polynucleotides.

The present invention relates to a collection or library of at least one novel nucleic acid
sequence assembled from expressed sequence tags (ESTs) isolated mainly by sequencing by
hybridization (SBH), and in some cases, sequences obtained from one or more public databases.
The invention relates also to the proteins encoded by such polynucleotides, along with therapeutic,
diagnostic and research utilities for these polynucleotides and proteins. These nucleic acid
sequences are designated as SEQ ID NO: 1-1786 and 3573-5358. The polypeptide sequences are
designated SEQ ID NO: 2n (wherein n = 1 to 20). The nucleic acids and polypeptides are provided
in the Sequence Listing. In the nucleic acids provided in the Sequence Listing, A is adenosine; C is
cytosine; G is guanine; T is thymine; and N is any of the four bases. In the amino acids provided in
the Sequence Listing, * corresponds to the stop codon.

The nucleic acid sequences of the present invention also include nucleic acid sequences that
hybridize to the complement of SEQ ID NO: 1-1786 and 3573-5358 under stringent hybridization
conditions; nucleic acid sequences which are allelic variants or species homologues of any of the
nucleic acid sequences recited above; or nucleic acid sequences that encode a peptide comprising a
specific domain or truncation of the peptides encoded by SEQ ID NO: 1-1786 and 3573-5358. Also
included is a polynucleotide comprising a nucleotide sequence having at least 90% identity to an
identifying sequence of SEQ ID NO: 1-1786 and 3573-5358 or a degenerate variant or fragment
thereof. The identifying sequence can be 100 base pairs in length.

The nucleic acid sequences of the present invention also include the sequence information
from the nucleic acid sequences of SEQ ID NO: 1-1786 and 3573-5358. The sequence information
can be a segment of any one of SEQ ID NO: 1-1786 and 3573-5358 that uniquely identifies or
represents the sequence information of SEQ ID NO: 1-1786 and 3573-5358.

A collection as used in this application can be a collection of only one polynucleotide. The
collection of sequence information or identifying information of each sequence can be provided on
a nucleic acid array. In one embodiment, segments of sequence information are provided on a
nucleic acid array to detect the polynucleotide that contains the segment. The array can be designed
to detect full-match or mismatch to the polynucleotide that contains the segment. The collection
can also be provided in a computer-readable format.

This invention also includes the reverse or direct complement of any of the nucleic acid
sequences recited above; cloning or expression vectors containing the nucleic acid sequences; and
host cells or organisms transformed with these expression vectors. Nucleic acid sequences (or their
reverse or direct complements) according to the invention have numerous applications in a variety
of techniques known to those skilled in the art of molecular biology, such as use as hybridization
probes, use as primers for PCR, use in an array, use in computer-readable media, use in sequencing
full-length genes, use for chromosome and gene mapping, use in the recombinant production of
protein, and use in the generation of anti-sense DNA or RNA, their chemical analogs and the like.

In a preferred embodiment, the nucleic acid sequences of SEQ ID NO: 1-1786 and 3573-5358
or novel segments or parts of the nucleic acids of the invention are used as primers in
expression assays that are well known in the art. In a particularly preferred embodiment, the nucleic
acid sequences of SEQ ID NO: 1-1786 and 3573-5358 or novel segments or parts of the nucleic
acids provided herein are used in diagnostics for identifying expressed genes or, as well known in
the art and exemplified by Vollrath et al., Science 258:52-59 (1992), as expressed sequence tags for
physical mapping of the human genome.

The isolated polynucleotides of the invention include, but are not limited to, a
polynucleotide comprising any one of the nucleotide sequences set forth in SEQ ID NO: 1-1786 and
3573-5358; a polynucleotide comprising any of the full length protein coding sequences of SEQ ID
NO: 1-1786 and 3573-5358; and a polynucleotide comprising any of the nucleotide sequences of the
mature protein coding sequences of SEQ ID NO: 1-1786 and 3573-5358. The polynucleotides of the
present invention also include, but are not limited to, a polynucleotide that hybridizes under
stringent hybridization conditions to (a) the complement of any one of the nucleotide sequences set
forth in SEQ ID NO: 1-1786 and 3573-5358; (b) a nucleotide sequence encoding any one of the
amino acid sequences set forth in the Sequence Listing; (c) a polynucleotide which is an allelic
variant of any polynucleotides recited above; (d) a polynucleotide which encodes a species homolog
(e.g. orthologs) of any of the proteins recited above; or (e) a polynucleotide that encodes a
polypeptide comprising a specific domain or truncation of any of the polypeptides comprising an
amino acid sequence set forth in the Sequence Listing.

The isolated polypeptides of the invention include, but are not limited to, a polypeptide
comprising any of the amino acid sequences set forth in the Sequence Listing; or the corresponding
full length or mature protein. Polypeptides of the invention also include polypeptides with biological
activity that are encoded by (a) any of the polynucleotides having a nucleotide sequence set forth in
SEQ ID NO: 1-1786 and 3573-5358; or (b) polynucleotides that hybridize to the complement of the
polynucleotides of (a) under stringent hybridization conditions. Biologically or immunologically
active variants of any of the polypeptide sequences in the Sequence Listing, and "substantial
equivalents" thereof (e.g., with at least about 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or 99%
amino acid sequence identity) that preferably retain biological activity are also contemplated. The
polypeptides of the invention may be wholly or partially chemically synthesized but are preferably
produced by recombinant means using the genetically engineered cells (e.g. host cells) of the
invention.

The invention also provides compositions comprising a polypeptide of the invention.
Polypeptide compositions of the invention may further comprise an acceptable carrier, such as a
hydrophilic, e.g., pharmaceutically acceptable, carrier.

The invention also provides host cells transformed or transfected with a polynucleotide of
the invention.

The invention also relates to methods for producing a polypeptide of the invention
comprising growing a culture of the host cells of the invention in a suitable culture medium
under conditions permitting expression of the desired polypeptide, and purifying the polypeptide
from the culture or from the host cells. Preferred embodiments include those in which the
protein produced by such process is a mature form of the protein.

Polynucleotides according to the invention have numerous applications in a variety of
techniques known to those skilled in the art of molecular biology. These techniques include use
as hybridization probes, use as oligomers, or primers, for PCR, use for chromosome and gene
mapping, use in the recombinant production of protein, and use in generation of anti-sense DNA
or RNA, their chemical analogs and the like. For example, when the expression of an mRNA is
largely restricted to a particular cell or tissue type, polynucleotides of the invention can be used
as hybridization probes to detect the presence of the particular cell or tissue mRNA in a sample
using, e.g., in situ hybridization.

In other exemplary embodiments, the polynucleotides are used in diagnostics as
expressed sequence tags for identifying expressed genes or, as well known in the art and
exemplified by Vollrath et al., Science 258:52-59 (1992), as expressed sequence tags for physical
mapping of the human genome.

The polypeptides according to the invention can be used in a variety of conventional
procedures and methods that are currently applied to other proteins. For example, a polypeptide
of the invention can be used to generate an antibody that specifically binds the polypeptide. Such
antibodies, particularly monoclonal antibodies, are useful for detecting or quantitating the
polypeptide in tissue. The polypeptides of the invention can also be used as molecular weight
markers, and as a food supplement.

Methods are also provided for preventing, treating, or ameliorating a medical condition
which comprise the step of administering to a mammalian subject a therapeutically effective
amount of a composition comprising a polypeptide of the present invention and a
pharmaceutically acceptable carrier.

In particular, the polypeptides and polynucleotides of the invention can be utilized, for
example, in methods for the prevention and/or treatment of disorders involving aberrant protein
expression or biological activity.

The present invention further relates to methods for detecting the presence of the
polynucleotides or polypeptides of the invention in a sample. Such methods can, for example, be
utilized as part of prognostic and diagnostic evaluation of disorders as recited herein and for the
identification of subjects exhibiting a predisposition to such conditions. The invention provides
a method for detecting the polynucleotides of the invention in a sample, comprising contacting
the sample with a compound that binds to and forms a complex with the polynucleotide of
interest for a period sufficient to form the complex and under conditions sufficient to form a
complex and detecting the complex such that if a complex is detected, the polynucleotide of
interest is detected. The invention also provides a method for detecting the polypeptides of the
invention in a sample comprising contacting the sample with a compound that binds to and forms
a complex with the polypeptide under conditions and for a period sufficient to form the complex
and detecting the formation of the complex such that if a complex is formed, the polypeptide is
detected.

The invention also provides kits comprising polynucleotide probes and/or monoclonal
antibodies, and optionally quantitative standards, for carrying out methods of the invention.
Furthermore, the invention provides methods for evaluating the efficacy of drugs, and
monitoring the progress of patients, involved in clinical trials for the treatment of disorders as
recited above.

The invention also provides methods for the identification of compounds that modulate
(i.e., increase or decrease) the expression or activity of the polynucleotides and/or polypeptides
of the invention. Such methods can be utilized, for example, for the identification of compounds
that can ameliorate symptoms of disorders as recited herein. Such methods can include, but are
not limited to, assays for identifying compounds and other substances that interact with (e.g.,
bind to) the polypeptides of the invention. The invention provides a method for identifying a
compound that binds to the polypeptides of the invention comprising contacting the compound
with a polypeptide of the invention in a cell for a time sufficient to form a polypeptide/compound
complex, wherein the complex drives expression of a reporter gene sequence in the cell; and
detecting the complex by detecting the reporter gene sequence expression such that if expression
of the reporter gene is detected the compound that binds to a polypeptide of the invention is
identified.

The invention also provides methods for treatment which involve the
administration of the polynucleotides or polypeptides of the invention to individuals exhibiting
symptoms or tendencies. In addition, the invention encompasses methods for treating diseases or
disorders as recited herein comprising administering compounds and other substances that
modulate the overall activity of the target gene products. Compounds and other substances can
effect such modulation either on the level of target gene/protein expression or target protein
activity.

The polypeptides of the present invention and the polynucleotides encoding them are also
useful for the same functions known to one of skill in the art as the polypeptides and
polynucleotides to which they have homology (set forth in Table 2); for which they have a
signature region (as set forth in Table 3); or for which they have homology to a gene family (as
set forth in Table 4). If no homology is set forth for a sequence, then the polypeptides and
polynucleotides of the present invention are useful for a variety of applications, as described
herein, including use in arrays for detection.

4. DETAILED DESCRIPTION OF THE INVENTION 



4.1 DEFINITIONS 

It must be noted that as used herein and in the appended claims, the singular forms "a",
"an" and "the" include plural references unless the context clearly dictates otherwise.

The term "active" refers to those forms of the polypeptide which retain the biologic
and/or immunologic activities of any naturally occurring polypeptide. According to the
invention, the terms "biologically active" or "biological activity" refer to a protein or peptide
having structural, regulatory or biochemical functions of a naturally occurring molecule.
Likewise "immunologically active" or "immunological activity" refers to the capability of the
natural, recombinant or synthetic polypeptide to induce a specific immune response in
appropriate animals or cells and to bind with specific antibodies.

The term "activated cells" as used in this application refers to those cells which are engaged in
extracellular or intracellular membrane trafficking, including the export of secretory or
enzymatic molecules as part of a normal or disease process.

The terms "complementary" or "complementarity" refer to the natural binding of
polynucleotides by base pairing. For example, the sequence 5'-AGT-3' binds to the
complementary sequence 3'-TCA-5'. Complementarity between two single-stranded molecules
may be "partial" such that only some of the nucleic acids bind or it may be "complete" such that
total complementarity exists between the single stranded molecules. The degree of
complementarity between the nucleic acid strands has significant effects on the efficiency and
strength of the hybridization between the nucleic acid strands.
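
The base-pairing rule in this definition can be illustrated with a short Python sketch. The function names (`complement`, `complementarity`) are illustrative and do not appear in the specification; the sketch assumes uppercase DNA input and only the standard A-T and G-C pairs.

```python
# Watson-Crick pairs for DNA (assumption: no ambiguity codes, uppercase input).
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complement(seq_5to3):
    """Return the pairing strand base by base, written in the 3'->5' direction."""
    return "".join(PAIRS[base] for base in seq_5to3)

def complementarity(strand_5to3, other_3to5):
    """Fraction of aligned positions that base-pair; 1.0 means 'complete'."""
    if len(strand_5to3) != len(other_3to5):
        raise ValueError("strands must be the same length")
    paired = sum(PAIRS[a] == b for a, b in zip(strand_5to3, other_3to5))
    return paired / len(strand_5to3)

print(complement("AGT"))              # 5'-AGT-3' pairs with 3'-TCA-5'
print(complementarity("AGT", "TCA"))  # complete complementarity
print(complementarity("AGT", "TCC"))  # partial: two of three positions pair
```

A value below 1.0 corresponds to the "partial" case described above, in which only some of the bases pair.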

The term "embryonic stem cells (ES)" refers to a cell that can give rise to many
differentiated cell types in an embryo or an adult, including the germ cells. The term "germ line
stem cells (GSCs)" refers to stem cells derived from primordial stem cells that provide a steady
and continuous source of germ cells for the production of gametes. The term "primordial germ
cells (PGCs)" refers to a small population of cells set aside from other cell lineages particularly
from the yolk sac, mesenteries, or gonadal ridges during embryogenesis that have the potential to
differentiate into germ cells and other cells. PGCs are the source from which GSCs and ES cells
are derived. The PGCs, the GSCs and the ES cells are capable of self-renewal. Thus these cells
not only populate the germ line and give rise to a plurality of terminally differentiated cells that
comprise the adult specialized organs, but are able to regenerate themselves.

The term "expression modulating fragment," EMF, means a series of nucleotides which
modulates the expression of an operably linked ORF or another EMF.

As used herein, a sequence is said to "modulate the expression of an operably linked
sequence" when the expression of the sequence is altered by the presence of the EMF. EMFs
include, but are not limited to, promoters, and promoter modulating sequences (inducible
elements). One class of EMFs are nucleic acid fragments which induce the expression of an
operably linked ORF in response to a specific regulatory factor or physiological event.

The terms "nucleotide sequence" or "nucleic acid" or "polynucleotide" or
"oligonucleotide" are used interchangeably and refer to a heteropolymer of nucleotides or the
sequence of these nucleotides. These phrases also refer to DNA or RNA of genomic or synthetic
origin which may be single-stranded or double-stranded and may represent the sense or the
antisense strand, to peptide nucleic acid (PNA) or to any DNA-like or RNA-like material. In the
sequences herein A is adenine, C is cytosine, T is thymine, G is guanine and N is A, C, G or T
(U). It is contemplated that where the polynucleotide is RNA, the T (thymine) in the sequences
provided herein is substituted with U (uracil). Generally, nucleic acid segments provided by this
invention may be assembled from fragments of the genome and short oligonucleotide linkers, or
from a series of oligonucleotides, or from individual nucleotides, to provide a synthetic nucleic
acid which is capable of being expressed in a recombinant transcriptional unit comprising
regulatory elements derived from a microbial or viral operon, or a eukaryotic gene.
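
The letter conventions above (T written as U for RNA, and N standing for any of A, C, G or T) can be sketched in Python as follows. The helper names `to_rna` and `matches_listed` are hypothetical and introduced only for illustration.

```python
def to_rna(dna):
    """Write a listed DNA-alphabet sequence as RNA by substituting U for T."""
    return dna.upper().replace("T", "U")

def matches_listed(listed, observed):
    """True if `observed` fits `listed`, where N in `listed` matches A, C, G or T."""
    if len(listed) != len(observed):
        return False
    return all(
        o in "ACGT" and (l == "N" or l == o)
        for l, o in zip(listed.upper(), observed.upper())
    )

print(to_rna("AGTN"))                  # T becomes U; N is left as the wildcard
print(matches_listed("ANT", "ACT"))    # N accepts C
print(matches_listed("ANT", "ACG"))    # last base differs
```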

The terms "oligonucleotide fragment" or a "polynucleotide fragment", "portion," or
"segment" or "probe" or "primer" are used interchangeably and refer to a sequence of nucleotide
residues which are at least about 5 nucleotides, more preferably at least about 7 nucleotides,
more preferably at least about 9 nucleotides, more preferably at least about 11 nucleotides and
most preferably at least about 17 nucleotides. The fragment is preferably less than about 500
nucleotides, preferably less than about 200 nucleotides, more preferably less than about 100
nucleotides, more preferably less than about 50 nucleotides and most preferably less than 30
nucleotides. Preferably the probe is from about 6 nucleotides to about 200 nucleotides,
preferably from about 15 to about 50 nucleotides, more preferably from about 17 to 30
nucleotides and most preferably from about 20 to 25 nucleotides. Preferably the fragments can
be used in polymerase chain reaction (PCR), various hybridization procedures or microarray
procedures to identify or amplify identical or related parts of mRNA or DNA molecules. A
fragment or segment may uniquely identify each polynucleotide sequence of the present
invention. Preferably the fragment comprises a sequence substantially similar to any one of SEQ
ID NOs: 1-20.

Probes may, for example, be used to determine whether specific mRNA molecules are
present in a cell or tissue or to isolate similar nucleic acid sequences from chromosomal DNA as
described by Walsh et al. (Walsh, P.S. et al., 1992, PCR Methods Appl 1:241-250). They may
be labeled by nick translation, Klenow fill-in reaction, PCR, or other methods well known in the
art. Probes of the present invention, their preparation and/or labeling are elaborated in
Sambrook, J. et al., 1989, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor
Laboratory, NY; or Ausubel, F.M. et al., 1989, Current Protocols in Molecular Biology, John
Wiley & Sons, New York NY, both of which are incorporated herein by reference in their
entirety.

The nucleic acid sequences of the present invention also include the sequence
information from the nucleic acid sequences of SEQ ID NO: 1-1786 and 3573-5358. The
sequence information can be a segment of any one of SEQ ID NO: 1-1786 and 3573-5358 that
uniquely identifies or represents the sequence information of that sequence of SEQ ID NO: 1-1786
and 3573-5358. One such segment can be a twenty-mer nucleic acid sequence because the
probability that a twenty-mer is fully matched in the human genome is 1 in 300. In the human
genome, there are three billion base pairs in one set of chromosomes. Because 4^20 possible
twenty-mers exist, there are 300 times more twenty-mers than there are base pairs in a set of
human chromosomes. Using the same analysis, the probability for a seventeen-mer to be fully
matched in the human genome is approximately 1 in 5. When these segments are used in arrays
for expression studies, fifteen-mer segments can be used. The probability that the fifteen-mer is
fully matched in the expressed sequences is also approximately one in five because expressed
sequences comprise less than approximately 5% of the entire genome sequence.
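
The arithmetic behind these estimates can be reproduced with a short Python sketch. The genome size and the 5% expressed fraction are taken from the paragraph above; the function name `kmers_per_position` is illustrative, and the model (every k-mer equally likely, one candidate site per base pair) is a simplifying assumption.

```python
GENOME_BP = 3_000_000_000          # base pairs in one set of human chromosomes
EXPRESSED_BP = 0.05 * GENOME_BP    # expressed sequences, taken as 5% of the genome

def kmers_per_position(k, target_bp):
    """Ratio of possible k-mers (4**k) to base-pair positions in the target."""
    return 4 ** k / target_bp

for k, target, label in [
    (20, GENOME_BP, "genome"),
    (17, GENOME_BP, "genome"),
    (15, EXPRESSED_BP, "expressed sequences"),
]:
    ratio = kmers_per_position(k, target)
    print(f"{k}-mer vs {label}: about 1 in {ratio:.1f}")
```

Under these assumptions the exact ratios come out somewhat above the rounded figures quoted in the text (roughly 1 in 370 for the twenty-mer, and between 1 in 5 and 1 in 8 for the other two cases), which is consistent with the text's order-of-magnitude argument.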

Similarly, when using sequence information for detecting a single mismatch, a segment can
be a twenty-five mer. The probability that the twenty-five mer would appear in a human genome
with a single mismatch is calculated by multiplying the probability for a full match (1 ÷ 4^25) times the
increased probability for mismatch at each nucleotide position (3 x 25). The probability that an
eighteen mer with a single mismatch can be detected in an array for expression studies is
approximately one in five. The probability that a twenty-mer with a single mismatch can be
detected in a human genome is approximately one in five.
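
The single-mismatch calculation can be sketched the same way. The formula (full-match probability multiplied by 3 x k) is the one stated above; the function names and the step of scaling by the number of base-pair positions are assumptions made for illustration.

```python
def single_mismatch_probability(k):
    """Full-match probability (1 / 4**k) times the 3 * k single-mismatch variants."""
    return (3 * k) / 4 ** k

def expected_hits(k, target_bp):
    """Expected number of single-mismatch occurrences among target_bp positions."""
    return single_mismatch_probability(k) * target_bp

GENOME_BP = 3_000_000_000
EXPRESSED_BP = 0.05 * GENOME_BP    # 5% of the genome, as in the preceding paragraph

print(single_mismatch_probability(25))   # per-position probability for a 25-mer
print(expected_hits(20, GENOME_BP))      # twenty-mer against the genome
print(expected_hits(18, EXPRESSED_BP))   # eighteen-mer against expressed sequences
```

With these inputs the twenty-mer and eighteen-mer cases each give an expected count between 0.1 and 0.2, the same order as the "approximately one in five" figures in the text.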

The term "open reading frame," ORF, means a series of nucleotide triplets coding for
amino acids without any termination codons and is a sequence translatable into protein.
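
As an illustration of this definition, the following Python sketch checks whether a DNA string is a run of triplets with no in-frame termination codon. It assumes the standard genetic code (stop codons TAA, TAG and TGA); the function name is hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # termination codons of the standard genetic code

def is_open_reading_frame(seq):
    """True if seq is a whole number of triplets containing no termination codon."""
    seq = seq.upper()
    if len(seq) == 0 or len(seq) % 3 != 0:
        return False
    codons = (seq[i:i + 3] for i in range(0, len(seq), 3))
    return all(codon not in STOP_CODONS for codon in codons)

print(is_open_reading_frame("ATGGCCAAA"))  # three sense codons
print(is_open_reading_frame("ATGTAAGCC"))  # in-frame TAA
print(is_open_reading_frame("ATGATAACC"))  # TAA present but out of frame
```

Only in-frame triplets are examined, so a stop-like trinucleotide that straddles two codons does not interrupt the reading frame.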

The terms "operably linked" or "operably associated" refer to functionally related nucleic
acid sequences. For example, a promoter is operably associated or operably linked with a coding
sequence if the promoter controls the transcription of the coding sequence. While operably
linked nucleic acid sequences can be contiguous and in the same reading frame, certain genetic
elements, e.g. repressor genes, are not contiguously linked to the coding sequence but still control
transcription/translation of the coding sequence.

The term "pluripotent" refers to the capability of a cell to differentiate into a number of
differentiated cell types that are present in an adult organism. A pluripotent cell is restricted in its
differentiation capability in comparison to a totipotent cell.

The terms "polypeptide" or "peptide" or "amino acid sequence" refer to an oligopeptide,
peptide, polypeptide or protein sequence or fragment thereof and to naturally occurring or
synthetic molecules. A polypeptide "fragment," "portion," or "segment" is a stretch of amino
acid residues of at least about 5 amino acids, preferably at least about 7 amino acids, more
preferably at least about 9 amino acids and most preferably at least about 17 or more amino
acids. The peptide preferably is not greater than about 200 amino acids, more preferably less
than 150 amino acids and most preferably less than 100 amino acids. Preferably the peptide is
from about 5 to about 200 amino acids. To be active, any polypeptide must have sufficient
length to display biological and/or immunological activity.

The term "naturally occurring polypeptide" refers to polypeptides produced by cells that
have not been genetically engineered and specifically contemplates various polypeptides arising
from post-translational modifications of the polypeptide including, but not limited to, acetylation,
carboxylation, glycosylation, phosphorylation, lipidation and acylation.

The term "translated protein coding portion" means a sequence which encodes for the full
length protein which may include any leader sequence or any processing sequence.

The term "mature protein coding sequence" means a sequence which encodes a peptide
or protein without a signal or leader sequence. The "mature protein portion" means that portion
of the protein which does not include a signal or leader sequence. The peptide may have been
produced by processing in the cell which removes any leader/signal sequence. The mature
protein portion may or may not include the initial methionine residue. The methionine residue
may be removed from the protein during processing in the cell. The peptide may be produced
synthetically or the protein may have been produced using a polynucleotide only encoding for
the mature protein coding sequence.

The term "derivative" refers to polypeptides chemically modified by such techniques as
ubiquitination, labeling (e.g., with radionuclides or various enzymes), covalent polymer
attachment such as pegylation (derivatization with polyethylene glycol) and insertion or
substitution by chemical synthesis of amino acids such as ornithine, which do not normally occur
in human proteins.

The term "variant" (or "analog") refers to any polypeptide differing from naturally
occurring polypeptides by amino acid insertions, deletions, and substitutions, created using, e.g.,
recombinant DNA techniques. Guidance in determining which amino acid residues may be
replaced, added or deleted without abolishing activities of interest, may be found by comparing
the sequence of the particular polypeptide with that of homologous peptides and minimizing the
number of amino acid sequence changes made in regions of high homology (conserved regions)
or by replacing amino acids with consensus sequence.

Alternatively, recombinant variants encoding these same or similar polypeptides may be
synthesized or selected by making use of the "redundancy" in the genetic code. Various codon
substitutions, such as the silent changes which produce various restriction sites, may be
introduced to optimize cloning into a plasmid or viral vector or expression in a particular
prokaryotic or eukaryotic system. Mutations in the polynucleotide sequence may be reflected in
the polypeptide or domains of other peptides added to the polypeptide to modify the properties of
any part of the polypeptide, to change characteristics such as ligand-binding affinities, interchain
affinities, or degradation/turnover rate.
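
A silent codon change of the kind described above can be illustrated in Python. The sketch builds the standard genetic code table and shows a hypothetical six-base example in which a single synonymous change creates a BamHI recognition site (GGATCC) without altering the encoded amino acids. The example sequence and function names are assumptions chosen for illustration, not sequences from the Sequence Listing.

```python
# Standard genetic code, with bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def translate(dna):
    """Translate complete in-frame codons; '*' marks a stop codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

def synonymous_codons(codon):
    """All codons encoding the same amino acid as `codon` (the 'redundancy')."""
    amino_acid = CODON_TABLE[codon]
    return sorted(c for c, a in CODON_TABLE.items() if a == amino_acid)

original = "GGATCT"   # hypothetical fragment encoding Gly-Ser
modified = "GGATCC"   # silent T->C change creating a BamHI site
print(translate(original), translate(modified))
print(synonymous_codons("TCT"))
```

Both fragments translate to the same two residues, so the restriction site is introduced without changing the polypeptide.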

Preferably, amino acid "substitutions" are the result of replacing one amino acid with
another amino acid having similar structural and/or chemical properties, i.e., conservative amino
acid replacements. "Conservative" amino acid substitutions may be made on the basis of
similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity, and/or the amphipathic
nature of the residues involved. For example, nonpolar (hydrophobic) amino acids include
alanine, leucine, isoleucine, valine, proline, phenylalanine, tryptophan, and methionine; polar
neutral amino acids include glycine, serine, threonine, cysteine, tyrosine, asparagine, and
glutamine; positively charged (basic) amino acids include arginine, lysine, and histidine; and
negatively charged (acidic) amino acids include aspartic acid and glutamic acid. "Insertions" or
"deletions" are preferably in the range of about 1 to 20 amino acids, more preferably 1 to 10
amino acids. The variation allowed may be experimentally determined by systematically making
insertions, deletions, or substitutions of amino acids in a polypeptide molecule using
recombinant DNA techniques and assaying the resulting recombinant variants for activity.
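
The four residue groups listed above can be encoded directly, as in the following Python sketch. Treating "same group" as the test for a conservative substitution is a simplification adopted here for illustration; the text itself names several other bases for similarity. The names `RESIDUE_CLASSES`, `residue_class` and `is_conservative` are hypothetical.

```python
# One-letter codes for the groups named in the text.
RESIDUE_CLASSES = {
    "nonpolar": set("ALIVPFWM"),      # Ala, Leu, Ile, Val, Pro, Phe, Trp, Met
    "polar neutral": set("GSTCYNQ"),  # Gly, Ser, Thr, Cys, Tyr, Asn, Gln
    "basic": set("RKH"),              # Arg, Lys, His
    "acidic": set("DE"),              # Asp, Glu
}

def residue_class(residue):
    """Return the name of the group containing a one-letter amino acid code."""
    for name, members in RESIDUE_CLASSES.items():
        if residue.upper() in members:
            return name
    raise ValueError(f"not one of the 20 standard residues: {residue!r}")

def is_conservative(original, replacement):
    """True if both residues fall in the same group (simplified criterion)."""
    return residue_class(original) == residue_class(replacement)

print(is_conservative("L", "I"))  # both nonpolar
print(is_conservative("D", "E"))  # both acidic
print(is_conservative("D", "K"))  # acidic to basic
```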

Alternatively, where alteration of function is desired, insertions, deletions or
non-conservative alterations can be engineered to produce altered polypeptides. Such alterations
can, for example, alter one or more of the biological functions or biochemical characteristics of
the polypeptides of the invention. For example, such alterations may change polypeptide
characteristics such as ligand-binding affinities, interchain affinities, or degradation/turnover
rate. Further, such alterations can be selected so as to generate polypeptides that are better suited
for expression, scale up and the like in the host cells chosen for expression. For example,
cysteine residues can be deleted or substituted with another amino acid residue in order to
eliminate disulfide bridges.

The terms "purified" or "substantially purified" as used herein denote that the indicated
nucleic acid or polypeptide is present in the substantial absence of other biological
macromolecules, e.g., polynucleotides, proteins, and the like. In one embodiment, the
polynucleotide or polypeptide is purified such that it constitutes at least 95% by weight, more
preferably at least 99% by weight, of the indicated biological macromolecules present (but water,
buffers, and other small molecules, especially molecules having a molecular weight of less than
1000 daltons, can be present).

The term "isolated" as used herein refers to a nucleic acid or polypeptide separated from
at least one other component (e.g., nucleic acid or polypeptide) present with the nucleic acid or
polypeptide in its natural source. In one embodiment, the nucleic acid or polypeptide is found in
the presence of (if anything) only a solvent, buffer, ion, or other component normally present in a
solution of the same. The terms "isolated" and "purified" do not encompass nucleic acids or
polypeptides present in their natural source.

The term "recombinant," when used herein to refer to a polypeptide or protein, means
that a polypeptide or protein is derived from recombinant (e.g., microbial, insect, or mammalian)
expression systems. "Microbial" refers to recombinant polypeptides or proteins made in
bacterial or fungal (e.g., yeast) expression systems. As a product, "recombinant microbial"
defines a polypeptide or protein essentially free of native endogenous substances and
unaccompanied by associated native glycosylation. Polypeptides or proteins expressed in most
bacterial cultures, e.g., E. coli, will be free of glycosylation modifications; polypeptides or
proteins expressed in yeast will have a glycosylation pattern in general different from those
expressed in mammalian cells.

The term "recombinant expression vehicle or vector" refers to a plasmid or phage or virus
or vector, for expressing a polypeptide from a DNA (RNA) sequence. An expression vehicle can
comprise a transcriptional unit comprising an assembly of (1) a genetic element or elements
having a regulatory role in gene expression, for example, promoters or enhancers, (2) a structural
or coding sequence which is transcribed into mRNA and translated into protein, and (3)
appropriate transcription initiation and termination sequences. Structural units intended for use
in yeast or eukaryotic expression systems preferably include a leader sequence enabling
extracellular secretion of translated protein by a host cell. Alternatively, where recombinant
protein is expressed without a leader or transport sequence, it may include an amino terminal
methionine residue. This residue may or may not be subsequently cleaved from the expressed
recombinant protein to provide a final product.

The term "recombinant expression system" means host cells which have stably integrated
a recombinant transcriptional unit into chromosomal DNA or carry the recombinant
transcriptional unit extrachromosomally. Recombinant expression systems as defined herein will
express heterologous polypeptides or proteins upon induction of the regulatory elements linked
to the DNA segment or synthetic gene to be expressed. This term also means host cells which
have stably integrated a recombinant genetic element or elements having a regulatory role in
gene expression, for example, promoters or enhancers. Recombinant expression systems as
defined herein will express polypeptides or proteins endogenous to the cell upon induction of the
regulatory elements linked to the endogenous DNA segment or gene to be expressed. The cells
can be prokaryotic or eukaryotic.

The term "secreted" includes a protein that is transported across or through a membrane,
including transport as a result of signal sequences in its amino acid sequence when it is expressed
in a suitable host cell. "Secreted" proteins include without limitation proteins secreted wholly
(e.g., soluble proteins) or partially (e.g., receptors) from the cell in which they are expressed.
"Secreted" proteins also include without limitation proteins that are transported across the
membrane of the endoplasmic reticulum. "Secreted" proteins are also intended to include
proteins containing non-typical signal sequences (e.g., Interleukin-1 Beta, see Krasney, P.A. and
Young, P.R. (1992) Cytokine 4(2):134-143) and factors released from damaged cells (e.g.,
Interleukin-1 Receptor Antagonist, see Arend, W.P. et al. (1998) Annu. Rev. Immunol.
16:27-55).

Where desired, an expression vector may be designed to contain a "signal or leader 
sequence" which will direct the polypeptide through the membrane of a cell. Such a sequence 
may be naturally present on the polypeptides of the present invention or provided from 
heterologous protein sources by recombinant DNA techniques. 

The term "stringent" is used to refer to conditions that are commonly understood in the
art as stringent. Stringent conditions can include highly stringent conditions (i.e., hybridization
to filter-bound DNA in 0.5 M NaHPO4, 7% sodium dodecyl sulfate (SDS), 1 mM EDTA at
65°C, and washing in 0.1X SSC/0.1% SDS at 68°C), and moderately stringent conditions (i.e.,
washing in 0.2X SSC/0.1% SDS at 42°C). Other exemplary hybridization conditions are
described herein in the examples.

In instances of hybridization of deoxyoligonucleotides, additional exemplary stringent
hybridization conditions include washing in 6X SSC/0.05% sodium pyrophosphate at 37°C (for
14-base oligonucleotides), 48°C (for 17-base oligonucleotides), 55°C (for 20-base
oligonucleotides), and 60°C (for 23-base oligonucleotides).

As used herein, "substantially equivalent" can refer both to nucleotide and amino acid
sequences, for example a mutant sequence, that varies from a reference sequence by one or more
substitutions, deletions, or additions, the net effect of which does not result in an adverse
functional dissimilarity between the reference and subject sequences. Typically, such a
substantially equivalent sequence varies from one of those listed herein by no more than about
35% (i.e., the number of individual residue substitutions, additions, and/or deletions in a
substantially equivalent sequence, as compared to the corresponding reference sequence, divided
by the total number of residues in the substantially equivalent sequence is about 0.35 or less).
Such a sequence is said to have 65% sequence identity to the listed sequence. In one
embodiment, a substantially equivalent, e.g., mutant, sequence of the invention varies from a
listed sequence by no more than 30% (70% sequence identity); in a variation of this embodiment,
by no more than 25% (75% sequence identity); in a further variation of this embodiment, by
no more than 20% (80% sequence identity); in a further variation of this embodiment, by no
more than 10% (90% sequence identity); and in a further variation of this embodiment, by no
more than 5% (95% sequence identity). Substantially equivalent, e.g., mutant, amino acid
sequences according to the invention preferably have at least 80% sequence identity with a listed
amino acid sequence, more preferably at least 90% sequence identity. Substantially equivalent
nucleotide sequences of the invention can have lower percent sequence identities, taking into
account, for example, the redundancy or degeneracy of the genetic code. Preferably, a nucleotide
sequence has at least about 65% identity, more preferably at least about 75% identity, and most
preferably at least about 95% identity. For the purposes of the present invention, sequences
having substantially equivalent biological activity and substantially equivalent expression
characteristics are considered substantially equivalent. For the purposes of determining
equivalence, truncation of the mature sequence (e.g., via a mutation which creates a spurious
stop codon) should be disregarded. Sequence identity may be determined, e.g., using the Jotun
Hein method (Hein, J. (1990) Methods Enzymol. 183:626-645). Identity between sequences can
also be determined by other methods known in the art, e.g., by varying hybridization conditions.
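
As a non-limiting illustration, the ratio used above to define "substantially equivalent" can be
computed as sketched below. The sketch assumes that the two sequences have already been
aligned by some other method (gaps written as "-"); the function name and example sequences
are hypothetical, and the sketch is not an implementation of the Jotun Hein method.

```python
# Hypothetical sketch: fraction of varied residues between two pre-aligned
# sequences, following the ratio described in the text: (substitutions +
# additions + deletions) / (residues in the subject sequence).
def variation_fraction(reference_aln, subject_aln):
    if len(reference_aln) != len(subject_aln):
        raise ValueError("aligned sequences must have equal length")
    changes = 0
    subject_residues = 0
    for ref_char, sub_char in zip(reference_aln, subject_aln):
        if sub_char != "-":
            subject_residues += 1
        if ref_char != sub_char:
            # substitution, addition (gap in reference) or deletion (gap in subject)
            changes += 1
    if subject_residues == 0:
        raise ValueError("subject sequence contains no residues")
    return changes / subject_residues


reference = "ACGTACGTAC"  # illustrative sequences only
subject = "ACGTTCG-AC"    # one substitution and one deletion
fraction = variation_fraction(reference, subject)
print(f"variation {fraction:.3f}, within the 0.35 limit: {fraction <= 0.35}")
```

Here two changes over nine subject residues give a variation of about 0.22, which falls within
the 0.35 limit stated above.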

The term "totipotent" refers to the capability of a cell to differentiate into all of the cell 
types of an adult organism. 

The term "transformation" means introducing DNA into a suitable host cell so that the
DNA is replicable, either as an extrachromosomal element, or by chromosomal integration. The
term "transfection" refers to the taking up of an expression vector by a suitable host cell, whether
or not any coding sequences are in fact expressed. The term "infection" refers to the introduction
of nucleic acids into a suitable host cell by use of a virus or viral vector.

As used herein, an "uptake modulating fragment," UMF, means a series of nucleotides
which mediate the uptake of a linked DNA fragment into a cell. UMFs can be readily identified
using known UMFs as a target sequence or target motif with the computer-based systems
described below. The presence and activity of a UMF can be confirmed by attaching the
suspected UMF to a marker sequence. The resulting nucleic acid molecule is then incubated
with an appropriate host under appropriate conditions and the uptake of the marker sequence is
determined. As described above, a UMF will increase the frequency of uptake of a linked
marker sequence.

Each of the above terms is meant to encompass all that is described for each, unless the 
context dictates otherwise. 



4.2 NUCLEIC ACIDS OF THE INVENTION

Nucleotide sequences of the invention are set forth in the Sequence Listing.
The isolated polynucleotides of the invention include a polynucleotide comprising the
nucleotide sequences of SEQ ID NO: 1-1786 and 3573-5358; a polynucleotide encoding any one
of the peptide sequences of SEQ ID NO: 1787-3572 and 5359-7144; and a polynucleotide
comprising the nucleotide sequence encoding the mature protein coding sequence of the
polypeptides of any one of SEQ ID NO: 1787-3572 and 5359-7144. The polynucleotides of the
present invention also include, but are not limited to, a polynucleotide that hybridizes under
stringent conditions to (a) the complement of any of the nucleotide sequences of SEQ ID NO:
1-1786 and 3573-5358; (b) nucleotide sequences encoding any one of the amino acid sequences
set forth in the Sequence Listing; (c) a polynucleotide which is an allelic variant of any
polynucleotide recited above; (d) a polynucleotide which encodes a species homolog of any of
the proteins recited above; or (e) a polynucleotide that encodes a polypeptide comprising a
specific domain or truncation of the polypeptides of SEQ ID NO: 1787-3572 and 5359-7144.
Domains of interest may depend on the nature of the encoded polypeptide; e.g., domains in
receptor-like polypeptides include ligand-binding, extracellular, transmembrane, or cytoplasmic
domains, or combinations thereof; domains in immunoglobulin-like proteins include the variable
immunoglobulin-like domains; domains in enzyme-like polypeptides include catalytic and
substrate binding domains; and domains in ligand polypeptides include receptor-binding
domains.

The polynucleotides of the invention include naturally occurring or wholly or partially
synthetic DNA, e.g., cDNA and genomic DNA, and RNA, e.g., mRNA. The polynucleotides
may include all of the coding region of the cDNA or may represent a portion of the coding
region of the cDNA.

The present invention also provides genes corresponding to the cDNA sequences disclosed
herein. The corresponding genes can be isolated in accordance with known methods using the
sequence information disclosed herein. Such methods include the preparation of probes or primers
from the disclosed sequence information for identification and/or amplification of genes in
appropriate genomic libraries or other sources of genomic materials. Further 5' and 3' sequence can
be obtained using methods known in the art. For example, full length cDNA or genomic DNA that
corresponds to any of the polynucleotides of SEQ ID NO: 1-1786 and 3573-5358 can be obtained
by screening appropriate cDNA or genomic DNA libraries under suitable hybridization conditions
using any of the polynucleotides of SEQ ID NO: 1-1786 and 3573-5358 or a portion thereof as a
probe. Alternatively, the polynucleotides of SEQ ID NO: 1-1786 and 3573-5358 may be used as the
basis for suitable primer(s) that allow identification and/or amplification of genes in appropriate
genomic DNA or cDNA libraries.

The nucleic acid sequences of the invention can be assembled from ESTs and sequences
(including cDNA and genomic sequences) obtained from one or more public databases, such as
dbEST, gbpri, and UniGene. The EST sequences can provide identifying sequence information,
representative fragment or segment information, or novel segment information for the full-length
gene.

The polynucleotides of the invention also include polynucleotides having nucleotide
sequences that are substantially equivalent to the polynucleotides recited above. Polynucleotides
according to the invention can have, e.g., at least about 65%, at least about 70%, at least about
75%, at least about 80%, more typically at least about 90%, and even more typically at least
about 95%, sequence identity to a polynucleotide recited above.

Included within the scope of the nucleic acid sequences of the invention are nucleic acid
sequence fragments that hybridize under stringent conditions to any of the nucleotide sequences
of SEQ ID NO: 1-1786 and 3573-5358, or complements thereof, which fragment is greater than
about 5 nucleotides, preferably 7 nucleotides, more preferably greater than 9 nucleotides and
most preferably greater than 17 nucleotides. Fragments of, e.g., 15, 17, or 20 nucleotides or more
that are selective for (i.e., specifically hybridize to) any one of the polynucleotides of the
invention are contemplated. Probes capable of specifically hybridizing to a polynucleotide can
differentiate polynucleotide sequences of the invention from other polynucleotide sequences in
the same family of genes or can differentiate human genes from genes of other species, and are
preferably based on unique nucleotide sequences.

The sequences falling within the scope of the present invention are not limited to these
specific sequences, but also include allelic and species variations thereof. Allelic and species
variations can be routinely determined by comparing the sequence provided in SEQ ID NO: 1-1786
and 3573-5358, a representative fragment thereof, or a nucleotide sequence at least 90% identical,
preferably 95% identical, to SEQ ID NO: 1-1786 and 3573-5358 with a sequence from another
isolate of the same species. Furthermore, to accommodate codon variability, the invention includes
nucleic acid molecules coding for the same amino acid sequences as do the specific ORFs disclosed
herein. In other words, in the coding region of an ORF, substitution of one codon for another codon
that encodes the same amino acid is expressly contemplated.
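
The codon substitution contemplated above can be illustrated with a short Python sketch.
The sketch assumes the standard genetic code; the function names and the example ORFs are
hypothetical and are not drawn from the Sequence Listing.

```python
# Hypothetical sketch: check that two ORFs differing by a synonymous codon
# encode the same amino acid sequence (standard genetic code assumed).
from itertools import product

BASES = "TCAG"
# Amino acids for the 64 codons in TCAG order (standard code); "*" marks a stop.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(orf):
    """Translate a DNA ORF (length divisible by 3) into one-letter amino acids."""
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, len(orf), 3))


def same_protein(orf_a, orf_b):
    """True if both ORFs encode an identical amino acid sequence."""
    return translate(orf_a) == translate(orf_b)


orf_1 = "ATGGCTTGA"  # Met-Ala-stop, alanine encoded by GCT
orf_2 = "ATGGCCTGA"  # Met-Ala-stop, alanine encoded by GCC
print(translate(orf_1), same_protein(orf_1, orf_2))
```

The two example ORFs differ at one nucleotide yet translate to the same product, which is the
situation the codon-variability language above is intended to cover.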

The nearest neighbor or homology result for the nucleic acids of the present invention,
including SEQ ID NO: 1-1786 and 3573-5358, can be obtained by searching a database using an
algorithm or a program. Preferably, BLAST, which stands for Basic Local Alignment Search Tool,
is used to search for local sequence alignments (Altschul, S.F., J. Mol. Evol. 36:290-300 (1993) and
Altschul, S.F. et al., J. Mol. Biol. 215:403-410 (1990)). Alternatively, a FASTA version 3 search
against Genpept, using the Fastxy algorithm, can be used.

Species homologs (or orthologs) of the disclosed polynucleotides and proteins are also
provided by the present invention. Species homologs may be isolated and identified by making
suitable probes or primers from the sequences provided herein and screening a suitable nucleic
acid source from the desired species.

The invention also encompasses allelic variants of the disclosed polynucleotides or
proteins; that is, naturally-occurring alternative forms of the isolated polynucleotide which also
encode proteins which are identical, homologous or related to that encoded by the
polynucleotides.

The nucleic acid sequences of the invention are further directed to sequences which
encode variants of the described nucleic acids. These amino acid sequence variants may be
prepared by methods known in the art by introducing appropriate nucleotide changes into a
native or variant polynucleotide. There are two variables in the construction of amino acid
sequence variants: the location of the mutation and the nature of the mutation. Nucleic acids
encoding the amino acid sequence variants are preferably constructed by mutating the
polynucleotide to encode an amino acid sequence that does not occur in nature. These nucleic
acid alterations can be made at sites that differ in the nucleic acids from different species
(variable positions) or in highly conserved regions (constant regions). Sites at such locations
will typically be modified in series, e.g., by substituting first with conservative choices (e.g.,
hydrophobic amino acid to a different hydrophobic amino acid) and then with more distant
choices (e.g., hydrophobic amino acid to a charged amino acid), and then deletions or insertions
may be made at the target site. Amino acid sequence deletions generally range from about 1 to
30 residues, preferably about 1 to 10 residues, and are typically contiguous. Amino acid
insertions include amino- and/or carboxyl-terminal fusions ranging in length from one to one
hundred or more residues, as well as intrasequence insertions of single or multiple amino acid
residues. Intrasequence insertions may range generally from about 1 to 10 amino residues,
preferably from 1 to 5 residues. Examples of terminal insertions include the heterologous signal
sequences necessary for secretion or for intracellular targeting in different host cells and
sequences such as FLAG or poly-histidine sequences useful for purifying the expressed protein.

In a preferred method, polynucleotides encoding the novel amino acid sequences are
changed via site-directed mutagenesis. This method uses oligonucleotide sequences to alter a
polynucleotide to encode the desired amino acid variant, as well as sufficient adjacent
nucleotides on both sides of the changed amino acid to form a stable duplex on either side of the
site being changed. In general, the techniques of site-directed mutagenesis are well known to
those of skill in the art and this technique is exemplified by publications such as Edelman et al.,
DNA 2:183 (1983). A versatile and efficient method for producing site-specific changes in a
polynucleotide sequence was published by Zoller and Smith, Nucleic Acids Res. 10:6487-6500
(1982). PCR may also be used to create amino acid sequence variants of the novel nucleic acids.
When small amounts of template DNA are used as starting material, primer(s) that differ
slightly in sequence from the corresponding region in the template DNA can generate the desired
amino acid variant. PCR amplification results in a population of product DNA fragments that
differ from the polynucleotide template encoding the polypeptide, at the position specified by the
primer. The product DNA fragments replace the corresponding region in the plasmid and this
gives a polynucleotide encoding the desired amino acid variant.
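
Purely as an illustrative sketch, the design of a mutagenic oligonucleotide having the altered
codon flanked by unchanged template sequence on both sides might look as follows. The
function name, the flank length, and the template are assumptions made for illustration; the
sketch does not assess duplex stability, which in practice depends on length and composition.

```python
# Hypothetical sketch: build a mutagenic oligonucleotide consisting of the
# replacement codon flanked on both sides by unchanged template sequence.
def mutagenic_oligo(template, codon_index, new_codon, flank=9):
    """Return left flank + new codon + right flank for the codon at `codon_index`."""
    if len(new_codon) != 3:
        raise ValueError("replacement codon must be three nucleotides")
    start = codon_index * 3
    end = start + 3
    if start - flank < 0 or end + flank > len(template):
        raise ValueError("not enough flanking sequence on one side")
    return template[start - flank:start] + new_codon + template[end:end + flank]


# Illustrative coding sequence only: ATG GCT AAA GGT CTG GAA TGC CTG ACC TAA
template = "ATGGCTAAAGGTCTGGAATGCCTGACCTAA"
# Replace codon 6 (TGC, cysteine) with TCC (serine), keeping 9 nt on each side.
oligo = mutagenic_oligo(template, codon_index=6, new_codon="TCC", flank=9)
print(oligo)
```

The example replaces a cysteine codon with a serine codon, in keeping with the cysteine
substitutions mentioned earlier in this section.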

A further technique for generating amino acid variants is the cassette mutagenesis
technique described in Wells et al., Gene 34:315 (1985); and other mutagenesis techniques well
known in the art, such as, for example, the techniques in Sambrook et al., supra, and Current
Protocols in Molecular Biology, Ausubel et al. Due to the inherent degeneracy of the genetic
code, other DNA sequences which encode substantially the same or a functionally equivalent
amino acid sequence may be used in the practice of the invention for the cloning and expression
of these novel nucleic acids. Such DNA sequences include those which are capable of
hybridizing to the appropriate novel nucleic acid sequence under stringent conditions.

Polynucleotides encoding preferred polypeptide truncations of the invention can be used
to generate polynucleotides encoding chimeric or fusion proteins comprising one or more
domains of the invention and heterologous protein sequences.

The polynucleotides of the invention additionally include the complement of any of the
polynucleotides recited above. The polynucleotide can be DNA (genomic, cDNA, amplified, or
synthetic) or RNA. Methods and algorithms for obtaining such polynucleotides are well known
to those of skill in the art and can include, for example, methods for determining hybridization
conditions that can routinely isolate polynucleotides of the desired sequence identities.
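
For illustration, the complement of a DNA polynucleotide, read 5' to 3', can be derived as in
the following minimal Python sketch. The function name and the example sequence are
hypothetical; only unambiguous DNA bases are handled.

```python
# Hypothetical sketch: derive the reverse complement of a DNA sequence.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the complementary strand written 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]


strand = "ATGGCC"  # illustrative sequence only
print(reverse_complement(strand))
```

Applying the function twice returns the original strand, which is a convenient sanity check.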

In accordance with the invention, polynucleotide sequences comprising the mature
protein coding sequences corresponding to any one of SEQ ID NO: 1-1786 and 3573-5358, or
functional equivalents thereof, may be used to generate recombinant DNA molecules that direct
the expression of that nucleic acid, or a functional equivalent thereof, in appropriate host cells.
Also included are the cDNA inserts of any of the clones identified herein.

A polynucleotide according to the invention can be joined to any of a variety of other
nucleotide sequences by well-established recombinant DNA techniques (see Sambrook J et al.
(1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, NY). Useful
nucleotide sequences for joining to polynucleotides include an assortment of vectors, e.g.,
plasmids, cosmids, lambda phage derivatives, phagemids, and the like, that are well known in the
art. Accordingly, the invention also provides a vector including a polynucleotide of the
invention and a host cell containing the polynucleotide. In general, the vector contains an origin
of replication functional in at least one organism, convenient restriction endonuclease sites, and a
selectable marker for the host cell. Vectors according to the invention include expression
vectors, replication vectors, probe generation vectors, and sequencing vectors. A host cell
according to the invention can be a prokaryotic or eukaryotic cell and can be a unicellular
organism or part of a multicellular organism.

The present invention further provides recombinant constructs comprising a nucleic acid
having any of the nucleotide sequences of SEQ ID NO: 1-1786 and 3573-5358 or a fragment
thereof or any other polynucleotides of the invention. In one embodiment, the recombinant
constructs of the present invention comprise a vector, such as a plasmid or viral vector, into
which a nucleic acid having any of the nucleotide sequences of SEQ ID NO: 1-1786 and
3573-5358 or a fragment thereof is inserted, in a forward or reverse orientation. In the case of a
vector comprising one of the ORFs of the present invention, the vector may further comprise
regulatory sequences, including for example, a promoter, operably linked to the ORF. Large
numbers of suitable vectors and promoters are known to those of skill in the art and are
commercially available for generating the recombinant constructs of the present invention. The
following vectors are provided by way of example. Bacterial: pBs, phagescript, PsiX174,
pBluescript SK, pBs KS, pNH8a, pNH16a, pNH18a, pNH46a (Stratagene); pTrc99A, pKK223-3,
pKK233-3, pDR540, pRIT5 (Pharmacia). Eukaryotic: pWLneo, pSV2cat, pOG44, pXT1, pSG
(Stratagene); pSVK3, pBPV, pMSG, pSVL (Pharmacia).

The isolated polynucleotide of the invention may be operably linked to an expression
control sequence such as the pMT2 or pED expression vectors disclosed in Kaufman et al.,
Nucleic Acids Res. 19, 4485-4490 (1991), in order to produce the protein recombinantly. Many
suitable expression control sequences are known in the art. General methods of expressing
recombinant proteins are also known and are exemplified in R. Kaufman, Methods in
Enzymology 185, 537-566 (1990). As defined herein "operably linked" means that the isolated
polynucleotide of the invention and an expression control sequence are situated within a vector
or cell in such a way that the protein is expressed by a host cell which has been transformed
(transfected) with the ligated polynucleotide/expression control sequence.

Promoter regions can be selected from any desired gene using CAT (chloramphenicol
acetyltransferase) vectors or other vectors with selectable markers. Two appropriate vectors are
pKK232-8 and pCM7. Particular named bacterial promoters include lacI, lacZ, T3, T7, gpt,
lambda PR, and trc. Eukaryotic promoters include CMV immediate early, HSV thymidine
kinase, early and late SV40, LTRs from retrovirus, and mouse metallothionein-I. Selection of
the appropriate vector and promoter is well within the level of ordinary skill in the art.
Generally, recombinant expression vectors will include origins of replication and selectable
markers permitting transformation of the host cell, e.g., the ampicillin resistance gene of E. coli
and the S. cerevisiae TRP1 gene, and a promoter derived from a highly-expressed gene to direct
transcription of a downstream structural sequence. Such promoters can be derived from operons
encoding glycolytic enzymes such as 3-phosphoglycerate kinase (PGK), α-factor, acid
phosphatase, or heat shock proteins, among others. The heterologous structural sequence is
assembled in appropriate phase with translation initiation and termination sequences, and
preferably, a leader sequence capable of directing secretion of translated protein into the
periplasmic space or extracellular medium. Optionally, the heterologous sequence can encode a
fusion protein including an amino terminal identification peptide imparting desired
characteristics, e.g., stabilization or simplified purification of expressed recombinant product.
Useful expression vectors for bacterial use are constructed by inserting a structural DNA
sequence encoding a desired protein together with suitable translation initiation and termination
signals in operable reading phase with a functional promoter. The vector will comprise one or
more phenotypic selectable markers and an origin of replication to ensure maintenance of the
vector and to, if desirable, provide amplification within the host. Suitable prokaryotic hosts for
transformation include E. coli, Bacillus subtilis, Salmonella typhimurium and various species
within the genera Pseudomonas, Streptomyces, and Staphylococcus, although others may also be
employed as a matter of choice.

As a representative but non-limiting example, useful expression vectors for bacterial use
can comprise a selectable marker and bacterial origin of replication derived from commercially
available plasmids comprising genetic elements of the well known cloning vector pBR322
(ATCC 37017). Such commercial vectors include, for example, pKK223-3 (Pharmacia Fine
Chemicals, Uppsala, Sweden) and GEM 1 (Promega Biotech, Madison, WI, USA). These
pBR322 "backbone" sections are combined with an appropriate promoter and the structural
sequence to be expressed. Following transformation of a suitable host strain and growth of the
host strain to an appropriate cell density, the selected promoter is induced or derepressed by
appropriate means (e.g., temperature shift or chemical induction) and cells are cultured for an
additional period. Cells are typically harvested by centrifugation, disrupted by physical or
chemical means, and the resulting crude extract retained for further purification.

Polynucleotides of the invention can also be used to induce immune responses. For
example, as described in Fan et al., Nat. Biotech. 17:870-872 (1999), incorporated herein by
reference, nucleic acid sequences encoding a polypeptide may be used to generate antibodies
against the encoded polypeptide following topical administration of naked plasmid DNA or
following injection, and preferably intramuscular injection, of the DNA. The nucleic acid
sequences are preferably inserted in a recombinant expression vector and may be in the form of
naked DNA.

4.3 ANTISENSE

Another aspect of the invention pertains to isolated antisense nucleic acid molecules that
are hybridizable to or complementary to the nucleic acid molecule comprising the nucleotide
sequence of SEQ ID NO: 1-1786 and 3573-5358, or fragments, analogs or derivatives thereof.
An "antisense" nucleic acid comprises a nucleotide sequence that is complementary to a "sense"
nucleic acid encoding a protein, e.g., complementary to the coding strand of a double-stranded
cDNA molecule or complementary to an mRNA sequence. In specific aspects, antisense nucleic
acid molecules are provided that comprise a sequence complementary to at least about 10, 25,
50, 100, 250 or 500 nucleotides or an entire coding strand, or to only a portion thereof. Nucleic
acid molecules encoding fragments, homologs, derivatives and analogs of a protein of any of
SEQ ID NO: 1787-3572 and 5359-7144 or antisense nucleic acids complementary to a nucleic
acid sequence of SEQ ID NO: 1-1786 and 3573-5358 are additionally provided.

In one embodiment, an antisense nucleic acid molecule is antisense to a "coding region"
of the coding strand of a nucleotide sequence of the invention. The term "coding region" refers
to the region of the nucleotide sequence comprising codons which are translated into amino acid
residues. In another embodiment, the antisense nucleic acid molecule is antisense to a
"noncoding region" of the coding strand of a nucleotide sequence of the invention. The term
"noncoding region" refers to 5' and 3' sequences which flank the coding region that are not
translated into amino acids (i.e., also referred to as 5' and 3' untranslated regions).

Given the coding strand sequences encoding a nucleic acid disclosed herein (e.g., SEQ ID
NO: 1-1786 and 3573-5358), antisense nucleic acids of the invention can be designed according
to the rules of Watson and Crick or Hoogsteen base pairing. The antisense nucleic acid molecule
can be complementary to the entire coding region of a mRNA, but more preferably is an
oligonucleotide that is antisense to only a portion of the coding or noncoding region of a mRNA.
For example, the antisense oligonucleotide can be complementary to the region surrounding the
translation start site of a mRNA. An antisense oligonucleotide can be, for example, about 5, 10,
15, 20, 25, 30, 35, 40, 45 or 50 nucleotides in length. An antisense nucleic acid of the invention
can be constructed using chemical synthesis or enzymatic ligation reactions using procedures
known in the art. For example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can
be chemically synthesized using naturally occurring nucleotides or variously modified
nucleotides designed to increase the biological stability of the molecules or to increase the
physical stability of the duplex formed between the antisense and sense nucleic acids, e.g.,
phosphorothioate derivatives and acridine substituted nucleotides can be used.
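
As a hedged illustration of designing an antisense oligonucleotide by Watson and Crick base
pairing, the following Python sketch selects a window around the first ATG of a sense-strand
sequence and returns its reverse complement. The function names, the 20-nucleotide length, the
assumption that the first ATG is the translation start site, and the example sequence are all
hypothetical choices made for illustration.

```python
# Hypothetical sketch: antisense oligonucleotide complementary to the region
# surrounding the translation start site (taken here to be the first ATG).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the complementary strand written 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]


def antisense_oligo(sense, length=20):
    """Return an antisense oligo covering a window roughly centered on the first ATG."""
    start = sense.find("ATG")
    if start < 0:
        raise ValueError("no ATG start codon found")
    left = max(0, start - (length - 3) // 2)
    window = sense[left:left + length]
    if len(window) < length:
        raise ValueError("sequence too short for the requested oligo length")
    return reverse_complement(window)


sense = "GGCACGAGGCCACCATGGCTAAAGGTCTGGAATGC"  # illustrative sequence only
oligo = antisense_oligo(sense, length=20)
print(oligo)
```

The sketch addresses sequence selection only; choices such as phosphorothioate or other
modified nucleotides are chemical design decisions outside its scope.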

Examples of modified nucleotides that can be used to generate the antisense nucleic acid
include: 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine,
4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil, 5-carboxymethylaminomethyl-2-thiouridine,
5-carboxymethylaminomethyluracil, dihydrouracil, beta-D-galactosylqueosine,
inosine, N6-isopentenyladenine, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine,
2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine,
7-methylguanine, 5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil,
beta-D-mannosylqueosine, 5'-methoxycarboxymethyluracil, 5-methoxyuracil,
2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v), wybutoxosine, pseudouracil,
queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil,
uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid (v), 5-methyl-2-thiouracil,
3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, and 2,6-diaminopurine. Alternatively, the
antisense nucleic acid can be produced biologically using an expression vector into which a
nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the
inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest;
described further in the following subsection).

The antisense nucleic acid molecules of the invention are typically administered to a
subject or generated in situ such that they hybridize with or bind to cellular mRNA and/or
genomic DNA encoding a protein according to the invention to thereby inhibit expression of the
protein, e.g., by inhibiting transcription and/or translation. The hybridization can be by
conventional nucleotide complementarity to form a stable duplex, or, for example, in the case of
an antisense nucleic acid molecule that binds to DNA duplexes, through specific interactions in
the major groove of the double helix. An example of a route of administration of antisense
nucleic acid molecules of the invention includes direct injection at a tissue site. Alternatively,
antisense nucleic acid molecules can be modified to target selected cells and then administered
systemically. For example, for systemic administration, antisense molecules can be modified
such that they specifically bind to receptors or antigens expressed on a selected cell surface, e.g.,
by linking the antisense nucleic acid molecules to peptides or antibodies that bind to cell surface
receptors or antigens. The antisense nucleic acid molecules can also be delivered to cells using
the vectors described herein. To achieve sufficient intracellular concentrations of antisense
molecules, vector constructs in which the antisense nucleic acid molecule is placed under the
control of a strong pol II or pol III promoter are preferred.

In yet another embodiment, the antisense nucleic acid molecule of the invention is an
α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms specific
double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the
strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids Res 15: 6625-6641). The
antisense nucleic acid molecule can also comprise a 2'-o-methylribonucleotide (Inoue et al.
(1987) Nucleic Acids Res 15: 6131-6148) or a chimeric RNA-DNA analogue (Inoue et al. (1987)
FEBS Lett 215: 327-330).

4.4 RIBOZYMES AND PNA MOIETIES 

In still another embodiment, an antisense nucleic acid of the invention is a ribozyme.
Ribozymes are catalytic RNA molecules with ribonuclease activity that are capable of cleaving a
single-stranded nucleic acid, such as an mRNA, to which they have a complementary region.
Thus, ribozymes (e.g., hammerhead ribozymes (described in Haselhoff and Gerlach (1988)
Nature 334:585-591)) can be used to catalytically cleave mRNA transcripts to thereby inhibit
translation of an mRNA. A ribozyme having specificity for a nucleic acid of the invention can be
designed based upon the nucleotide sequence of a DNA disclosed herein (i.e., SEQ ID
NO:1-1786 and 3573-5358). For example, a derivative of a Tetrahymena L-19 IVS RNA can be
constructed in which the nucleotide sequence of the active site is complementary to the
nucleotide sequence to be cleaved in a SECX-encoding mRNA. See, e.g., Cech et al. U.S. Pat.
No. 4,987,071; and Cech et al. U.S. Pat. No. 5,116,742. Alternatively, SECX mRNA can be
used to select a catalytic RNA having a specific ribonuclease activity from a pool of RNA
molecules. See, e.g., Bartel et al. (1993) Science 261:1411-1418.

Alternatively, gene expression can be inhibited by targeting nucleotide sequences
complementary to the regulatory region (e.g., promoter and/or enhancers) to form triple helical
structures that prevent transcription of the gene in target cells. See generally, Helene (1991)
Anticancer Drug Des. 6: 569-84; Helene et al. (1992) Ann. N.Y. Acad. Sci. 660:27-36; and
Maher (1992) Bioassays 14: 807-15.

In various embodiments, the nucleic acids of the invention can be modified at the base
moiety, sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization, or
solubility of the molecule. For example, the deoxyribose phosphate backbone of the nucleic
acids can be modified to generate peptide nucleic acids (see Hyrup et al. (1996) Bioorg Med
Chem 4: 5-23). As used herein, the terms "peptide nucleic acids" or "PNAs" refer to nucleic acid
mimics, e.g., DNA mimics, in which the deoxyribose phosphate backbone is replaced by a
pseudopeptide backbone and only the four natural nucleobases are retained. The neutral
backbone of PNAs has been shown to allow for specific hybridization to DNA and RNA under
conditions of low ionic strength. The synthesis of PNA oligomers can be performed using
standard solid phase peptide synthesis protocols as described in Hyrup et al. (1996) above;
Perry-O'Keefe et al. (1996) PNAS 93: 14670-675.

PNAs of the invention can be used in therapeutic and diagnostic applications. For
example, PNAs can be used as antisense or antigene agents for sequence-specific modulation of
gene expression by, e.g., inducing transcription or translation arrest or inhibiting replication.
PNAs of the invention can also be used, e.g., in the analysis of single base pair mutations in a
gene by, e.g., PNA directed PCR clamping; as artificial restriction enzymes when used in
combination with other enzymes, e.g., S1 nucleases (Hyrup B. (1996) above); or as probes or
primers for DNA sequence and hybridization (Hyrup et al. (1996), above; Perry-O'Keefe (1996),
above).

In another embodiment, PNAs of the invention can be modified, e.g., to enhance their
stability or cellular uptake, by attaching lipophilic or other helper groups to PNA, by the
formation of PNA-DNA chimeras, or by the use of liposomes or other techniques of drug
delivery known in the art. For example, PNA-DNA chimeras can be generated that may
combine the advantageous properties of PNA and DNA. Such chimeras allow DNA recognition
enzymes, e.g., RNase H and DNA polymerases, to interact with the DNA portion while the PNA
portion would provide high binding affinity and specificity. PNA-DNA chimeras can be linked
using linkers of appropriate lengths selected in terms of base stacking, number of bonds between
the nucleobases, and orientation (Hyrup (1996) above). The synthesis of PNA-DNA chimeras
can be performed as described in Hyrup (1996) above and Finn et al. (1996) Nucl Acids Res 24:
3357-63. For example, a DNA chain can be synthesized on a solid support using standard
phosphoramidite coupling chemistry, and modified nucleoside analogs, e.g.,
5'-(4-methoxytrityl)amino-5'-deoxy-thymidine phosphoramidite, can be used between the PNA
and the 5' end of DNA (Mag et al. (1989) Nucl Acid Res 17: 5973-88). PNA monomers are then
coupled in a stepwise manner to produce a chimeric molecule with a 5' PNA segment and a 3'
DNA segment (Finn et al. (1996) above). Alternatively, chimeric molecules can be synthesized
with a 5' DNA segment and a 3' PNA segment. See, Petersen et al. (1995) Bioorg Med Chem
Lett 5: 1119-1124.

In other embodiments, the oligonucleotide may include other appended groups such as
peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport across the
cell membrane (see, e.g., Letsinger et al., 1989, Proc. Natl. Acad. Sci. U.S.A. 86:6553-6556;
Lemaitre et al., 1987, Proc. Natl. Acad. Sci. 84:648-652; PCT Publication No. WO88/09810) or
the blood-brain barrier (see, e.g., PCT Publication No. WO89/10134). In addition,
oligonucleotides can be modified with hybridization triggered cleavage agents (see, e.g., Krol et
al., 1988, BioTechniques 6:958-976) or intercalating agents (see, e.g., Zon, 1988, Pharm. Res.
5: 539-549). To this end, the oligonucleotide may be conjugated to another molecule, e.g., a
peptide, a hybridization triggered cross-linking agent, a transport agent, a hybridization-triggered
cleavage agent, etc.

4.5 HOSTS 

The present invention further provides host cells genetically engineered to contain the
polynucleotides of the invention. For example, such host cells may contain nucleic acids of the
invention introduced into the host cell using known transformation, transfection or infection
methods. The present invention still further provides host cells genetically engineered to express
the polynucleotides of the invention, wherein such polynucleotides are in operative association
with a regulatory sequence heterologous to the host cell which drives expression of the
polynucleotides in the cell.

Knowledge of nucleic acid sequences allows for modification of cells to permit, or
increase, expression of endogenous polypeptide. Cells can be modified (e.g., by homologous
recombination) to provide increased polypeptide expression by replacing, in whole or in part, the
naturally occurring promoter with all or part of a heterologous promoter so that the cells express
the polypeptide at higher levels. The heterologous promoter is inserted in such a manner that it
is operatively linked to the encoding sequences. See, for example, PCT International Publication
No. WO94/12650, PCT International Publication No. WO92/20808, and PCT International
Publication No. WO91/09955. It is also contemplated that, in addition to heterologous promoter
DNA, amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene which
encodes carbamyl phosphate synthase, aspartate transcarbamylase, and dihydroorotase) and/or
intron DNA may be inserted along with the heterologous promoter DNA. If linked to coding
sequence, amplification of the marker DNA by standard selection methods results in
co-amplification of the desired protein coding sequences in the cells.

The host cell can be a higher eukaryotic host cell, such as a mammalian cell, a lower
eukaryotic host cell, such as a yeast cell, or the host cell can be a prokaryotic cell, such as a
bacterial cell. Introduction of the recombinant construct into the host cell can be effected by
calcium phosphate transfection, DEAE-dextran mediated transfection, or electroporation (Davis,
L. et al., Basic Methods in Molecular Biology (1986)). The host cells containing one of the
polynucleotides of the invention can be used in conventional manners to produce the gene
product encoded by the isolated fragment (in the case of an ORF) or can be used to produce a
heterologous protein under the control of the EMF.

Any host/vector system can be used to express one or more of the ORFs of the present
invention. These include, but are not limited to, eukaryotic hosts such as HeLa cells, CV-1 cells,
COS cells, 293 cells, and Sf9 cells, as well as prokaryotic hosts such as E. coli and B. subtilis.
The most preferred cells are those which do not normally express the particular polypeptide or
protein or which express the polypeptide or protein at a low natural level. Mature proteins can
be expressed in mammalian cells, yeast, bacteria, or other cells under the control of appropriate
promoters. Cell-free translation systems can also be employed to produce such proteins using
RNAs derived from the DNA constructs of the present invention. Appropriate cloning and
expression vectors for use with prokaryotic and eukaryotic hosts are described by Sambrook, et
al., in Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor, New
York (1989), the disclosure of which is hereby incorporated by reference.

Various mammalian cell culture systems can also be employed to express recombinant
protein. Examples of mammalian expression systems include the COS-7 lines of monkey kidney
fibroblasts, described by Gluzman, Cell 23:175 (1981). Other cell lines capable of expressing a
compatible vector are, for example, the C127, monkey COS cells, Chinese Hamster Ovary
(CHO) cells, human kidney 293 cells, human epidermal A431 cells, human Colo205 cells, 3T3
cells, CV-1 cells, other transformed primate cell lines, normal diploid cells, cell strains derived
from in vitro culture of primary tissue, primary explants, HeLa cells, mouse L cells, BHK,
HL-60, U937, HaK or Jurkat cells. Mammalian expression vectors will comprise an origin of
replication, a suitable promoter and also any necessary ribosome binding sites, polyadenylation
site, splice donor and acceptor sites, transcriptional termination sequences, and 5' flanking
nontranscribed sequences. DNA sequences derived from the SV40 viral genome, for example,
SV40 origin, early promoter, enhancer, splice, and polyadenylation sites may be used to provide
the required nontranscribed genetic elements. Recombinant polypeptides and proteins produced
in bacterial culture are usually isolated by initial extraction from cell pellets, followed by one or
more salting-out, aqueous ion exchange or size exclusion chromatography steps. Protein
refolding steps can be used, as necessary, in completing configuration of the mature protein.
Finally, high performance liquid chromatography (HPLC) can be employed for final purification
steps. Microbial cells employed in expression of proteins can be disrupted by any convenient
method, including freeze-thaw cycling, sonication, mechanical disruption, or use of cell lysing
agents.

Alternatively, it may be possible to produce the protein in lower eukaryotes such as yeast
or insects or in prokaryotes such as bacteria. Potentially suitable yeast strains include
Saccharomyces cerevisiae, Schizosaccharomyces pombe, Kluyveromyces strains, Candida, or
any yeast strain capable of expressing heterologous proteins. Potentially suitable bacterial
strains include Escherichia coli, Bacillus subtilis, Salmonella typhimurium, or any bacterial
strain capable of expressing heterologous proteins. If the protein is made in yeast or bacteria, it
may be necessary to modify the protein produced therein, for example by phosphorylation or
glycosylation of the appropriate sites, in order to obtain the functional protein. Such covalent
attachments may be accomplished using known chemical or enzymatic methods.

In another embodiment of the present invention, cells and tissues may be engineered to
express an endogenous gene comprising the polynucleotides of the invention under the control of
inducible regulatory elements, in which case the regulatory sequences of the endogenous gene
may be replaced by homologous recombination. As described herein, gene targeting can be used
to replace a gene's existing regulatory region with a regulatory sequence isolated from a different
gene or a novel regulatory sequence synthesized by genetic engineering methods. Such
regulatory sequences may be comprised of promoters, enhancers, scaffold-attachment regions,
negative regulatory elements, transcriptional initiation sites, regulatory protein binding sites or
combinations of said sequences. Alternatively, sequences which affect the structure or stability
of the RNA or protein produced may be replaced, removed, added, or otherwise modified by
targeting. These sequences include polyadenylation signals, mRNA stability elements, splice
sites, leader sequences for enhancing or modifying transport or secretion properties of the
protein, or other sequences which alter or improve the function or stability of protein or RNA
molecules.

The targeting event may be a simple insertion of the regulatory sequence, placing the
gene under the control of the new regulatory sequence, e.g., inserting a new promoter or
enhancer or both upstream of a gene. Alternatively, the targeting event may be a simple deletion
of a regulatory element, such as the deletion of a tissue-specific negative regulatory element.
Alternatively, the targeting event may replace an existing element; for example, a tissue-specific
enhancer can be replaced by an enhancer that has broader or different cell-type specificity than
the naturally occurring elements. Here, the naturally occurring sequences are deleted and new
sequences are added. In all cases, the identification of the targeting event may be facilitated by
the use of one or more selectable marker genes that are contiguous with the targeting DNA,
allowing for the selection of cells in which the exogenous DNA has integrated into the host cell
genome. The identification of the targeting event may also be facilitated by the use of one or
more marker genes exhibiting the property of negative selection, such that the negatively
selectable marker is linked to the exogenous DNA, but configured such that the negatively
selectable marker flanks the targeting sequence, and such that a correct homologous
recombination event with sequences in the host cell genome does not result in the stable
integration of the negatively selectable marker. Markers useful for this purpose include the
Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial xanthine-guanine
phosphoribosyl-transferase (gpt) gene.

The gene targeting or gene activation techniques which can be used in accordance with
this aspect of the invention are more particularly described in U.S. Patent No. 5,272,071 to
Chappel; U.S. Patent No. 5,578,461 to Sherwin et al.; International Application No.
PCT/US92/09627 (WO93/09222) by Selden et al.; and International Application No.
PCT/US90/06436 (WO91/06667) by Skoultchi et al., each of which is incorporated by reference
herein in its entirety.

4.6 POLYPEPTIDES OF THE INVENTION

The isolated polypeptides of the invention include, but are not limited to, a polypeptide
comprising: the amino acid sequences set forth as any one of SEQ ID NO:1787-3572 and
5359-7144 or an amino acid sequence encoded by any one of the nucleotide sequences SEQ ID
NO:1-1786 and 3573-5358 or the corresponding full length or mature protein. Polypeptides of the
invention also include polypeptides preferably with biological or immunological activity that are
encoded by: (a) a polynucleotide having any one of the nucleotide sequences set forth in SEQ ID
NO:1-1786 and 3573-5358 or (b) polynucleotides encoding any one of the amino acid sequences
set forth as SEQ ID NO:1787-3572 and 5359-7144 or (c) polynucleotides that hybridize to the
complement of the polynucleotides of either (a) or (b) under stringent hybridization conditions.
The invention also provides biologically active or immunologically active variants of any of the
amino acid sequences set forth as SEQ ID NO:1787-3572 and 5359-7144 or the corresponding
full length or mature protein; and "substantial equivalents" thereof (e.g., with at least about
65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least
about 90%, typically at least about 95%, more typically at least about 98%, or most typically at
least about 99% amino acid identity) that retain biological activity. Polypeptides encoded by
allelic variants may have a similar, increased, or decreased activity compared to polypeptides
comprising SEQ ID NO:1787-3572 and 5359-7144.
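
The percent amino acid identity thresholds recited above can be made concrete with a short
sketch. It is illustrative only: the function name and example sequences are hypothetical, the
two sequences are assumed to have been aligned already (gaps written as "-"), and the
denominator used here (aligned columns, excluding columns that are gaps in both sequences) is
one common convention among several, not a definition taken from this disclosure.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity between two pre-aligned sequences of equal length."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Ignore columns in which both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    # A column counts as a match only if both residues are identical and not gaps.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


# Made-up ten-residue peptides differing at one position.
identity = percent_identity("MKTAYIAKQR", "MKTAYLAKQR")
print(identity)            # 90.0
print(identity >= 90.0)    # True: meets an "at least about 90%" threshold
```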

Fragments of the proteins of the present invention which are capable of exhibiting
biological activity are also encompassed by the present invention. Fragments of the protein may
be in linear form or they may be cyclized using known methods, for example, as described in H.
U. Saragovi, et al., Bio/Technology 10, 773-778 (1992) and in R. S. McDowell, et al., J. Amer.
Chem. Soc. 114, 9245-9253 (1992), both of which are incorporated herein by reference. Such
fragments may be fused to carrier molecules such as immunoglobulins for many purposes,
including increasing the valency of protein binding sites.

The present invention also provides both full-length and mature forms (for example,
without a signal sequence or precursor sequence) of the disclosed proteins. The protein coding
sequence is identified in the sequence listing by translation of the disclosed nucleotide
sequences. The mature form of such protein may be obtained by expression of a full-length
polynucleotide in a suitable mammalian cell or other host cell. The sequence of the mature form
of the protein is also determinable from the amino acid sequence of the full-length form. Where
proteins of the present invention are membrane bound, soluble forms of the proteins are also
provided. In such forms, part or all of the regions causing the proteins to be membrane bound
are deleted so that the proteins are fully secreted from the cell in which they are expressed.

Protein compositions of the present invention may further comprise an acceptable carrier,
such as a hydrophilic, e.g., pharmaceutically acceptable, carrier.

The present invention further provides isolated polypeptides encoded by the nucleic acid
fragments of the present invention or by degenerate variants of the nucleic acid fragments of the
present invention. By "degenerate variant" is intended nucleotide fragments which differ from a
nucleic acid fragment of the present invention (e.g., an ORF) by nucleotide sequence but, due to
the degeneracy of the genetic code, encode an identical polypeptide sequence. Preferred nucleic
acid fragments of the present invention are the ORFs that encode proteins.
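
The notion of a degenerate variant can be sketched in a few lines. The example below is not
from the disclosure: the function names and the short sequences are hypothetical, and it
assumes the standard genetic code, in-frame DNA input, and translation that stops at the first
stop codon.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ('*' marks a stop codon).
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(_BASES, repeat=3), _AMINO)}


def translate(orf):
    """Translate an in-frame DNA sequence, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(orf) - len(orf) % 3, 3):
        residue = CODON_TABLE[orf[i:i + 3].upper()]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


def is_degenerate_variant(orf_a, orf_b):
    """True if the nucleotide sequences differ but encode the same polypeptide."""
    return orf_a.upper() != orf_b.upper() and translate(orf_a) == translate(orf_b)


# Made-up ORFs: both encode Met-Ala-Ser using different Ala and Ser codons.
print(translate("ATGGCTTCA"))                           # MAS
print(is_degenerate_variant("ATGGCTTCA", "ATGGCCAGC"))  # True
```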

A variety of methodologies known in the art can be utilized to obtain any one of the
isolated polypeptides or proteins of the present invention. At the simplest level, the amino acid
sequence can be synthesized using commercially available peptide synthesizers. The
synthetically-constructed protein sequences, by virtue of sharing primary, secondary or tertiary
structural and/or conformational characteristics with proteins may possess biological properties
in common therewith, including protein activity. This technique is particularly useful in
producing small peptides and fragments of larger polypeptides. Fragments are useful, for
example, in generating antibodies against the native polypeptide. Thus, they may be employed
as biologically active or immunological substitutes for natural, purified proteins in screening of
therapeutic compounds and in immunological processes for the development of antibodies.

The polypeptides and proteins of the present invention can alternatively be purified from
cells which have been altered to express the desired polypeptide or protein. As used herein, a
cell is said to be altered to express a desired polypeptide or protein when the cell, through genetic
manipulation, is made to produce a polypeptide or protein which it normally does not produce or
which the cell normally produces at a lower level. One skilled in the art can readily adapt
procedures for introducing and expressing either recombinant or synthetic sequences into
eukaryotic or prokaryotic cells in order to generate a cell which produces one of the polypeptides
or proteins of the present invention.

The invention also relates to methods for producing a polypeptide comprising growing a
culture of host cells of the invention in a suitable culture medium, and purifying the protein from
the cells or the culture in which the cells are grown. For example, the methods of the invention
include a process for producing a polypeptide in which a host cell containing a suitable
expression vector that includes a polynucleotide of the invention is cultured under conditions that
allow expression of the encoded polypeptide. The polypeptide can be recovered from the
culture, conveniently from the culture medium, or from a lysate prepared from the host cells and
further purified. Preferred embodiments include those in which the protein produced by such
process is a full length or mature form of the protein.

In an alternative method, the polypeptide or protein is purified from bacterial cells which
naturally produce the polypeptide or protein. One skilled in the art can readily follow known
methods for isolating polypeptides and proteins in order to obtain one of the isolated
polypeptides or proteins of the present invention. These include, but are not limited to,
immunochromatography, HPLC, size-exclusion chromatography, ion-exchange chromatography,
and immuno-affinity chromatography. See, e.g., Scopes, Protein Purification: Principles and
Practice, Springer-Verlag (1994); Sambrook, et al., in Molecular Cloning: A Laboratory
Manual; Ausubel et al., Current Protocols in Molecular Biology. Polypeptide fragments that
retain biological/immunological activity include fragments comprising greater than about 100
amino acids, or greater than about 200 amino acids, and fragments that encode specific protein
domains.

The purified polypeptides can be used in in vitro binding assays which are well known in
the art to identify molecules which bind to the polypeptides. These molecules include, but are not
limited to, e.g., small molecules, molecules from combinatorial libraries, antibodies or other
proteins. The molecules identified in the binding assay are then tested for antagonist or agonist
activity in in vivo tissue culture or animal models that are well known in the art. In brief, the
molecules are titrated into a plurality of cell cultures or animals and then tested for either
cell/animal death or prolonged survival of the animal/cells.

In addition, the peptides of the invention or molecules capable of binding to the peptides
may be complexed with toxins, e.g., ricin or cholera, or with other compounds that are toxic to
cells. The toxin-binding molecule complex is then targeted to a tumor or other cell by the
specificity of the binding molecule for SEQ ID NO:1787-3572 and 5359-7144.

The protein of the invention may also be expressed as a product of transgenic animals,
e.g., as a component of the milk of transgenic cows, goats, pigs, or sheep which are characterized
by somatic or germ cells containing a nucleotide sequence encoding the protein.

The proteins provided herein also include proteins characterized by amino acid sequences
similar to those of purified proteins but into which modifications are naturally provided or
deliberately engineered. For example, modifications in the peptide or DNA sequence can be
made by those skilled in the art using known techniques. Modifications of interest in the protein
sequences may include the alteration, substitution, replacement, insertion or deletion of a
selected amino acid residue in the coding sequence. For example, one or more of the cysteine
residues may be deleted or replaced with another amino acid to alter the conformation of the
molecule. Techniques for such alteration, substitution, replacement, insertion or deletion are
well known to those skilled in the art (see, e.g., U.S. Pat. No. 4,518,584). Preferably, such
alteration, substitution, replacement, insertion or deletion retains the desired activity of the
protein. Regions of the protein that are important for the protein function can be determined by
various methods known in the art including the alanine-scanning method, which involves
systematic substitution of single or strings of amino acids with alanine, followed by testing the
resulting alanine-containing variant for biological activity. This type of analysis determines the
importance of the substituted amino acid(s) in biological activity. Regions of the protein that are
important for protein function may be determined by the eMATRIX program.
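
The bookkeeping side of the alanine-scanning method can be sketched as follows. This is an
illustrative assumption, not a protocol from the disclosure: the function name and the example
peptide are hypothetical, only single-residue substitutions are enumerated (strings of residues
are not covered), and the activity testing of each variant is a laboratory step outside the
code.

```python
def alanine_scan(protein):
    """List single-residue alanine variants of a protein sequence.

    Returns (label, variant_sequence) pairs, where the label uses the usual
    <original residue><1-based position>A notation. Positions that are already
    alanine are skipped because substituting them changes nothing.
    """
    variants = []
    for index, residue in enumerate(protein):
        if residue == "A":
            continue
        label = f"{residue}{index + 1}A"
        variants.append((label, protein[:index] + "A" + protein[index + 1:]))
    return variants


# Made-up four-residue peptide.
for label, sequence in alanine_scan("MCAK"):
    print(label, sequence)
# M1A ACAK
# C2A MAAK
# K4A MCAA
```

Each variant in the list would then be expressed and assayed; a loss of activity at a given
position points to a residue that is important for function.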

Other fragments and derivatives of the sequences of proteins which would be expected to
retain protein activity in whole or in part and are useful for screening or other immunological
methodologies may also be easily made by those skilled in the art given the disclosures herein.
Such modifications are encompassed by the present invention.

The protein may also be produced by operably linking the isolated polynucleotide of the
invention to suitable control sequences in one or more insect expression vectors, and employing
an insect expression system. Materials and methods for baculovirus/insect cell expression
systems are commercially available in kit form from, e.g., Invitrogen, San Diego, Calif., U.S.A.
(the MaxBac kit), and such methods are well known in the art, as described in Summers and
Smith, Texas Agricultural Experiment Station Bulletin No. 1555 (1987), incorporated herein by
reference. As used herein, an insect cell capable of expressing a polynucleotide of the present
invention is "transformed."

The protein of the invention may be prepared by culturing transformed host cells under
culture conditions suitable to express the recombinant protein. The resulting expressed protein
may then be purified from such culture (i.e., from culture medium or cell extracts) using known
purification processes, such as gel filtration and ion exchange chromatography. The purification
of the protein may also include an affinity column containing agents which will bind to the
protein; one or more column steps over such affinity resins as concanavalin A-agarose,
heparin-toyopearl™ or Cibacrom blue 3GA Sepharose™; one or more steps involving
hydrophobic interaction chromatography using such resins as phenyl ether, butyl ether, or propyl
ether; or immunoaffinity chromatography.

Alternatively, the protein of the invention may also be expressed in a form which will
facilitate purification. For example, it may be expressed as a fusion protein, such as those of
maltose binding protein (MBP), glutathione-S-transferase (GST) or thioredoxin (TRX), or as a
His tag. Kits for expression and purification of such fusion proteins are commercially available
from New England BioLab (Beverly, Mass.), Pharmacia (Piscataway, N.J.) and Invitrogen,
respectively. The protein can also be tagged with an epitope and subsequently purified by using
a specific antibody directed to such epitope. One such epitope ("FLAG®") is commercially
available from Kodak (New Haven, Conn.).

Finally, one or more reverse-phase high performance liquid chromatography (RP-HPLC)
steps employing hydrophobic RP-HPLC media, e.g., silica gel having pendant methyl or other
aliphatic groups, can be employed to further purify the protein. Some or all of the foregoing
purification steps, in various combinations, can also be employed to provide a substantially
homogeneous isolated recombinant protein. The protein thus purified is substantially free of
other mammalian proteins and is defined in accordance with the present invention as an "isolated
protein."

The polypeptides of the invention include analogs (variants). This embraces fragments,
as well as peptides in which one or more amino acids have been deleted, inserted, or substituted.
Also, analogs of the polypeptides of the invention embrace fusions of the polypeptides or
modifications of the polypeptides of the invention, wherein the polypeptide or analog is fused to
another moiety or moieties, e.g., targeting moiety or another therapeutic agent. Such analogs
may exhibit improved properties such as activity and/or stability. Examples of moieties which
may be fused to the polypeptide or an analog include, for example, targeting moieties which
provide for the delivery of polypeptide to pancreatic cells, e.g., antibodies to pancreatic cells,
antibodies to immune cells such as T-cells, monocytes, dendritic cells, granulocytes, etc., as well
as receptors and ligands expressed on pancreatic or immune cells. Other moieties which may be
fused to the polypeptide include therapeutic agents which are used for treatment, for example,
immunosuppressive drugs such as cyclosporin, FK506, azathioprine, CD3 antibodies and
steroids. Also, polypeptides may be fused to immune modulators, and other cytokines such as
alpha or beta interferon.

4.6.1 DETERMINING POLYPEPTIDE AND POLYNUCLEOTIDE IDENTITY 
AND SIMILARITY 

Preferred methods to determine identity and/or similarity are designed to give the largest match
between the sequences tested. Methods to determine identity and similarity are codified in computer
programs including, but not limited to, the GCG program package, including GAP
(Devereux, J., et al., Nucleic Acids Research 12(1):387 (1984); Genetics Computer Group,
University of Wisconsin, Madison, WI), BLASTP, BLASTN, BLASTX, FASTA (Altschul, S.F.
et al., J. Molec. Biol. 215:403-410 (1990)), PSI-BLAST (Altschul S.F. et al., Nucleic Acids Res.
vol. 25, pp. 3389-3402, herein incorporated by reference), eMatrix software (Wu et al., J. Comp.
Biol., Vol. 6, pp. 219-235 (1999), herein incorporated by reference), eMotif software
(Nevill-Manning et al., ISMB-97, Vol. 4, pp. 202-209, herein incorporated by reference), pFam software
(Sonnhammer et al., Nucleic Acids Res., Vol. 26(1), pp. 320-322 (1998), herein incorporated by
reference) and the Kyte-Doolittle hydrophobicity prediction algorithm (J. Mol. Biol. 157, pp.
105-31 (1982), incorporated herein by reference). The BLAST programs are publicly available
from the National Center for Biotechnology Information (NCBI) and other sources (BLAST
Manual, Altschul, S., et al., NCBI NLM NIH, Bethesda, MD 20894; Altschul, S., et al., J. Mol.
Biol. 215:403-410 (1990)).
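
By way of non-limiting illustration only, the notion of aligning two sequences for the largest match and then reporting percent identity may be sketched as follows. The sketch is a plain Needleman-Wunsch global alignment and is not the GAP or BLAST implementation; the function name, the scoring values (match +1, mismatch -1, linear gap penalty -2) and the choice of the alignment length as the denominator for percent identity are all assumptions made for the example.

```python
def global_alignment_identity(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align a and b; return (aligned_a, aligned_b, percent_identity).

    Illustrative only: simple linear gap penalty, identity computed over
    the full alignment length (gap columns included).
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    i, j = n, m
    out_a, out_b = [], []
    while i > 0 or j > 0:
        sub = match if (i > 0 and j > 0 and a[i - 1] == b[j - 1]) else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1

    aligned_a = "".join(reversed(out_a))
    aligned_b = "".join(reversed(out_b))
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    percent_identity = 100.0 * matches / len(aligned_a)
    return aligned_a, aligned_b, percent_identity


if __name__ == "__main__":
    # Toy sequences chosen for the example; not sequences of the invention.
    print(global_alignment_identity("ACGT", "AGT"))
```

Production programs such as those cited above use substitution matrices and affine gap penalties, so identity values obtained from this sketch are not expected to agree exactly with theirs.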

4.7 CHIMERIC AND FUSION PROTEINS 

The invention also provides chimeric or fusion proteins. As used herein, a "chimeric
protein" or "fusion protein" comprises a polypeptide of the invention operatively linked to
another polypeptide. Within a fusion protein the polypeptide according to the invention can
correspond to all or a portion of a protein according to the invention. In one embodiment, a
fusion protein comprises at least one biologically active portion of a protein according to the
invention. In another embodiment, a fusion protein comprises at least two biologically active
portions of a protein according to the invention. Within the fusion protein, the term "operatively
linked" is intended to indicate that the polypeptide according to the invention and the other
polypeptide are fused in-frame to each other. The polypeptide can be fused to the N-terminus or
C-terminus.

For example, in one embodiment a fusion protein comprises a polypeptide according to
the invention operably linked to the extracellular domain of a second protein.

In another embodiment, the fusion protein is a GST-fusion protein in which the polypeptide 
sequences of the invention are fused to the C-terminus of the GST (i.e., glutathione
S-transferase) sequences. 

In another embodiment, the fusion protein is an immunoglobulin fusion protein in which
polypeptide sequences according to the invention comprising one or more domains are fused
to sequences derived from a member of the immunoglobulin protein family. The
immunoglobulin fusion proteins of the invention can be incorporated into pharmaceutical
compositions and administered to a subject to inhibit an interaction between a ligand and a
protein of the invention on the surface of a cell, to thereby suppress signal transduction in vivo.
The immunoglobulin fusion proteins can be used to affect the bioavailability of a cognate ligand.
Inhibition of the ligand/protein interaction may be useful therapeutically both for the treatment of
proliferative and differentiative disorders, e.g., cancer, and for modulating (e.g., promoting or
inhibiting) cell survival. Moreover, the immunoglobulin fusion proteins of the invention can be
used as immunogens to produce antibodies in a subject, to purify ligands, and in screening assays
to identify molecules that inhibit the interaction of a polypeptide of the invention with a ligand.

A chimeric or fusion protein of the invention can be produced by standard recombinant
DNA techniques. For example, DNA fragments coding for the different polypeptide sequences
are ligated together in-frame in accordance with conventional techniques, e.g., by employing
blunt-ended or stagger-ended termini for ligation, restriction enzyme digestion to provide for
appropriate termini, filling-in of cohesive ends as appropriate, alkaline phosphatase treatment to
avoid undesirable joining, and enzymatic ligation. In another embodiment, the fusion gene can
be synthesized by conventional techniques including automated DNA synthesizers.
Alternatively, PCR amplification of gene fragments can be carried out using anchor primers that
give rise to complementary overhangs between two consecutive gene fragments that can
subsequently be annealed and reamplified to generate a chimeric gene sequence (see, for
example, Ausubel et al. (eds.) Current Protocols in Molecular Biology, John Wiley &
Sons, 1992). Moreover, many expression vectors are commercially available that already encode
a fusion moiety (e.g., a GST polypeptide). A nucleic acid encoding a polypeptide of the
invention can be cloned into such an expression vector such that the fusion moiety is linked
in-frame to the protein of the invention.
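
By way of non-limiting illustration only, the requirement that two coding sequences be joined in-frame may be checked computationally before a construct is made. The following sketch assumes DNA coding sequences supplied as plain strings; the function names, the optional linker argument and the toy sequences are hypothetical and are not sequences of the invention or of any commercial vector.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def split_codons(seq):
    """Split a sequence into consecutive triplets, ignoring any trailing partial codon."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def fuse_in_frame(n_terminal_cds, c_terminal_cds, linker=""):
    """Join two coding sequences so that the downstream part stays in frame.

    Raises ValueError if the junction would shift the reading frame or if a
    stop codon would terminate translation before the final codon.
    """
    upstream = n_terminal_cds.upper()
    linker = linker.upper()
    downstream = c_terminal_cds.upper()

    if len(upstream) % 3 or len(linker) % 3 or len(downstream) % 3:
        raise ValueError("junction is out of frame: each part must be a multiple of 3 nt")

    # Remove the upstream stop codon so that translation reads through the junction.
    if upstream[-3:] in STOP_CODONS:
        upstream = upstream[:-3]

    fusion = upstream + linker + downstream
    internal_codons = split_codons(fusion)[:-1]
    if any(codon in STOP_CODONS for codon in internal_codons):
        raise ValueError("premature stop codon inside the fusion")
    return fusion


if __name__ == "__main__":
    tag = "ATGGCTTAA"      # toy stand-in for a fusion moiety (Met-Ala-stop)
    insert = "GGTCCATGA"   # toy stand-in for the fused polypeptide (Gly-Pro-stop)
    print(fuse_in_frame(tag, insert))
```

The check covers only frame and premature termination; restriction sites, codon usage and linker design are outside the scope of the sketch.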

4.8 GENE THERAPY 

Mutations in the polynucleotides of the invention may result in loss of normal
function of the encoded protein. The invention thus provides gene therapy to restore normal
activity of the polypeptides of the invention, or to treat disease states involving polypeptides of
the invention. Delivery of a functional gene encoding polypeptides of the invention to
appropriate cells is effected ex vivo, in situ, or in vivo by use of vectors, and more particularly
viral vectors (e.g., adenovirus, adeno-associated virus, or a retrovirus), or ex vivo by use of
physical DNA transfer methods (e.g., liposomes or chemical treatments). See, for example,
Anderson, Nature, supplement to vol. 392, no. 6679, pp. 25-30 (1998). For additional reviews of
gene therapy technology see Friedmann, Science, 244: 1275-1281 (1989); Verma, Scientific
American: 68-84 (1990); and Miller, Nature, 357: 455-460 (1992). Introduction of any one of
the nucleotides of the present invention or a gene encoding the polypeptides of the present
invention can also be accomplished with extrachromosomal substrates (transient expression) or
artificial chromosomes (stable expression). Cells may also be cultured ex vivo in the presence of
proteins of the present invention in order to proliferate or to produce a desired effect on or
activity in such cells. Treated cells can then be introduced in vivo for therapeutic purposes.
Alternatively, it is contemplated that in other human disease states, preventing the expression of
or inhibiting the activity of polypeptides of the invention will be useful in treating the disease
states. It is contemplated that antisense therapy or gene therapy could be applied to negatively
regulate the expression of polypeptides of the invention.

Other methods inhibiting expression of a protein include the introduction of antisense
molecules to the nucleic acids of the present invention, their complements, or their translated RNA
sequences, by methods known in the art. Further, the polypeptides of the present invention can be
inhibited by using targeted deletion methods, or the insertion of a negative regulatory element such
as a silencer, which is tissue specific.

The present invention still further provides cells genetically engineered in vivo to express the
polynucleotides of the invention, wherein such polynucleotides are in operative association with a
regulatory sequence heterologous to the host cell which drives expression of the polynucleotides in
the cell. These methods can be used to increase or decrease the expression of the polynucleotides of
the present invention.

Knowledge of DNA sequences provided by the invention allows for modification of cells to
permit, increase, or decrease expression of endogenous polypeptide. Cells can be modified (e.g., by
homologous recombination) to provide increased polypeptide expression by replacing, in whole or
in part, the naturally occurring promoter with all or part of a heterologous promoter so that the cells
express the protein at higher levels. The heterologous promoter is inserted in such a manner that it is
operatively linked to the desired protein encoding sequences. See, for example, PCT International
Publication No. WO 94/12650, PCT International Publication No. WO 92/20808, and PCT
International Publication No. WO 91/09955. It is also contemplated that, in addition to heterologous
promoter DNA, amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene which
encodes carbamyl phosphate synthase, aspartate transcarbamylase, and dihydroorotase) and/or
intron DNA may be inserted along with the heterologous promoter DNA. If linked to the desired
protein coding sequence, amplification of the marker DNA by standard selection methods results in
co-amplification of the desired protein coding sequences in the cells.

In another embodiment of the present invention, cells and tissues may be engineered to
express an endogenous gene comprising the polynucleotides of the invention under the control of
inducible regulatory elements, in which case the regulatory sequences of the endogenous gene may
be replaced by homologous recombination. As described herein, gene targeting can be used to
replace a gene's existing regulatory region with a regulatory sequence isolated from a different gene
or a novel regulatory sequence synthesized by genetic engineering methods. Such regulatory
sequences may be comprised of promoters, enhancers, scaffold-attachment regions, negative
regulatory elements, transcriptional initiation sites, regulatory protein binding sites or combinations
of said sequences. Alternatively, sequences which affect the structure or stability of the RNA or
protein produced may be replaced, removed, added, or otherwise modified by targeting. These
sequences include polyadenylation signals, mRNA stability elements, splice sites, leader sequences
for enhancing or modifying transport or secretion properties of the protein, or other sequences
which alter or improve the function or stability of protein or RNA molecules.

The targeting event may be a simple insertion of the regulatory sequence, placing the gene
under the control of the new regulatory sequence, e.g., inserting a new promoter or enhancer or both
upstream of a gene. Alternatively, the targeting event may be a simple deletion of a regulatory
element, such as the deletion of a tissue-specific negative regulatory element. Alternatively, the
targeting event may replace an existing element; for example, a tissue-specific enhancer can be
replaced by an enhancer that has broader or different cell-type specificity than the naturally
occurring elements. Here, the naturally occurring sequences are deleted and new sequences are
added. In all cases, the identification of the targeting event may be facilitated by the use of one or
more selectable marker genes that are contiguous with the targeting DNA, allowing for the selection
of cells in which the exogenous DNA has integrated into the cell genome. The identification of the
targeting event may also be facilitated by the use of one or more marker genes exhibiting the
property of negative selection, such that the negatively selectable marker is linked to the exogenous
DNA, but configured such that the negatively selectable marker flanks the targeting sequence, and
such that a correct homologous recombination event with sequences in the host cell genome does
not result in the stable integration of the negatively selectable marker. Markers useful for this
purpose include the Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial
xanthine-guanine phosphoribosyl-transferase (gpt) gene.
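
By way of non-limiting illustration only, the logic by which a positively selectable marker and a flanking negatively selectable marker together enrich for correctly targeted cells may be summarized in a few lines. The sketch is a deliberately idealized truth table; the class name, field names and category labels are hypothetical, and real selections are not perfectly efficient.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clone:
    # Positive marker: contiguous with the targeting DNA, retained on any integration.
    has_positive_marker: bool
    # Negative marker (e.g., a TK gene) flanking the targeting sequence:
    # lost on correct homologous recombination, retained on random integration.
    has_negative_marker: bool


def classify(clone):
    """Return an idealized interpretation of a clone's marker status."""
    if not clone.has_positive_marker:
        return "no integration"
    if clone.has_negative_marker:
        return "random integration"
    return "candidate targeted integration"


def survives_double_selection(clone):
    """A clone survives only if it keeps the positive marker and lacks the negative one."""
    return classify(clone) == "candidate targeted integration"


if __name__ == "__main__":
    for clone in (Clone(False, False), Clone(True, True), Clone(True, False)):
        print(clone, "->", classify(clone))
```

Surviving clones are labeled candidates because the targeting event would ordinarily still be confirmed by an independent method.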

The gene targeting or gene activation techniques which can be used in accordance with this
aspect of the invention are more particularly described in U.S. Patent No. 5,272,071 to Chappel;
U.S. Patent No. 5,578,461 to Sherwin et al.; International Application No. PCT/US92/09627
(WO93/09222) by Selden et al.; and International Application No. PCT/US90/06436
(WO91/06667) by Skoultchi et al., each of which is incorporated by reference herein in its entirety.

4.9 TRANSGENIC ANIMALS 

In preferred methods to determine biological functions of the polypeptides of the
invention in vivo, one or more genes provided by the invention are either over expressed or
inactivated in the germ line of animals using homologous recombination [Capecchi, Science
244:1288-1292 (1989)]. Animals in which the gene is over expressed, under the regulatory
control of exogenous or endogenous promoter elements, are known as transgenic animals.
Animals in which an endogenous gene has been inactivated by homologous recombination are
referred to as "knockout" animals. Knockout animals, preferably non-human mammals, can be
prepared as described in U.S. Patent No. 5,557,032, incorporated herein by reference. Transgenic
animals are useful to determine the roles polypeptides of the invention play in biological
processes, and preferably in disease states. Transgenic animals are useful as model systems to
identify compounds that modulate lipid metabolism. Transgenic animals, preferably non-human
mammals, are produced using methods as described in U.S. Patent No. 5,489,743 and PCT
Publication No. WO 94/28122, incorporated herein by reference.

Transgenic animals can be prepared wherein all or part of a promoter of the
polynucleotides of the invention is either activated or inactivated to alter the level of expression
of the polypeptides of the invention. Inactivation can be carried out using homologous
recombination methods described above. Activation can be achieved by supplementing or even
replacing the homologous promoter to provide for increased protein expression. The homologous
promoter can be supplemented by insertion of one or more heterologous enhancer elements
known to confer promoter activation in a particular tissue.

The polynucleotides of the present invention also make possible the development,
through, e.g., homologous recombination or knock out strategies, of animals that fail to express
polypeptides of the invention or that express a variant polypeptide. Such animals are useful as
models for studying the in vivo activities of polypeptide as well as for studying modulators of the
polypeptides of the invention.

4.10 USES AND BIOLOGICAL ACTIVITY 

The polynucleotides and proteins of the present invention are expected to exhibit one or
more of the uses or biological activities (including those associated with assays cited herein)
identified herein. Uses or activities described for proteins of the present invention may be
provided by administration or use of such proteins or of polynucleotides encoding such proteins
(such as, for example, in gene therapies or vectors suitable for introduction of DNA). The
mechanism underlying the particular condition or pathology will dictate whether the
polypeptides of the invention, the polynucleotides of the invention or modulators (activators or
inhibitors) thereof would be beneficial to the subject in need of treatment. Thus, "therapeutic
compositions of the invention" include compositions comprising isolated polynucleotides
(including recombinant DNA molecules, cloned genes and degenerate variants thereof) or
polypeptides of the invention (including full length protein, mature protein and truncations or
domains thereof), or compounds and other substances that modulate the overall activity of the
target gene products, either at the level of target gene/protein expression or target protein
activity. Such modulators include polypeptides, analogs (variants), including fragments and
fusion proteins, antibodies and other binding proteins; chemical compounds that directly or
indirectly activate or inhibit the polypeptides of the invention (identified, e.g., via drug screening
assays as described herein); antisense polynucleotides and polynucleotides suitable for triple
helix formation; and in particular antibodies or other binding partners that specifically recognize
one or more epitopes of the polypeptides of the invention.

The polypeptides of the present invention may likewise be involved in cellular activation 
or in one of the other physiological pathways described herein. 

4.10.1 RESEARCH USES AND UTILITIES 

The polynucleotides provided by the present invention can be used by the research 
community for various purposes. The polynucleotides can be used to express recombinant
protein for analysis, characterization or therapeutic use; as markers for tissues in which the 
corresponding protein is preferentially expressed (either constitutively or at a particular stage of 
tissue differentiation or development or in disease states); as molecular weight markers on gels; 
as chromosome markers or tags (when labeled) to identify chromosomes or to map related gene 
positions; to compare with endogenous DNA sequences in patients to identify potential genetic 
disorders; as probes to hybridize and thus discover novel, related DNA sequences; as a source of 
information to derive PCR primers for genetic fingerprinting; as a probe to "subtract-out" known
sequences in the process of discovering other novel polynucleotides; for selecting and making
oligomers for attachment to a "gene chip" or other support, including for examination of
expression patterns; to raise anti-protein antibodies using DNA immunization techniques; and as
an antigen to raise anti-DNA antibodies or elicit another immune response. Where the 
polynucleotide encodes a protein which binds or potentially binds to another protein (such as, for
example, in a receptor-ligand interaction), the polynucleotide can also be used in interaction trap
assays (such as, for example, that described in Gyuris et al., Cell 75:791-803 (1993)) to identify
polynucleotides encoding the other protein with which binding occurs or to identify inhibitors of
the binding interaction.

The polypeptides provided by the present invention can similarly be used in assays to
determine biological activity, including in a panel of multiple proteins for high-throughput
screening; to raise antibodies or to elicit another immune response; as a reagent (including the
labeled reagent) in assays designed to quantitatively determine levels of the protein (or its
receptor) in biological fluids; as markers for tissues in which the corresponding polypeptide is
preferentially expressed (either constitutively or at a particular stage of tissue differentiation or
development or in a disease state); and, of course, to isolate correlative receptors or ligands.
Proteins involved in these binding interactions can also be used to screen for peptide or small
molecule inhibitors or agonists of the binding interaction.

Any or all of these research utilities are capable of being developed into reagent grade or
kit format for commercialization as research products.

Methods for performing the uses listed above are well known to those skilled in the art. 
References disclosing such methods include without limitation "Molecular Cloning: A 
Laboratory Manual", 2d ed., Cold Spring Harbor Laboratory Press, Sambrook, J., E. F. Fritsch
and T. Maniatis eds., 1989, and "Methods in Enzymology: Guide to Molecular Cloning
Techniques", Academic Press, Berger, S. L. and A. R. Kimmel eds., 1987.

4.10.2 NUTRITIONAL USES 

Polynucleotides and polypeptides of the present invention can also be used as nutritional
sources or supplements. Such uses include without limitation use as a protein or amino acid
supplement, use as a carbon source, use as a nitrogen source and use as a source of carbohydrate. In
such cases the polypeptide or polynucleotide of the invention can be added to the feed of a
particular organism or can be administered as a separate solid or liquid preparation, such as in the
form of powder, pills, solutions, suspensions or capsules. In the case of microorganisms, the
polypeptide or polynucleotide of the invention can be added to the medium in or on which the
microorganism is cultured.

4.10.3 CYTOKINE AND CELL PROLIFERATION/DIFFERENTIATION 
ACTIVITY 

A polypeptide of the present invention may exhibit activity relating to cytokine, cell 
proliferation (either inducing or inhibiting) or cell differentiation (either inducing or inhibiting) 
activity or may induce production of other cytokines in certain cell populations. A 
polynucleotide of the invention can encode a polypeptide exhibiting such attributes. Many 
protein factors discovered to date, including all known cytokines, have exhibited activity in one
or more factor-dependent cell proliferation assays, and hence the assays serve as a convenient
confirmation of cytokine activity. The activity of therapeutic compositions of the present
invention is evidenced by any one of a number of routine factor dependent cell proliferation
assays for cell lines including, without limitation, 32D, DA2, DA1G, T10, B9, B9/11, BaF3,
MC9/G, M+(preB M+), 2E8, RB5, DA1, 123, T1165, HT2, CTLL2, TF-1, Mo7e, CMK,
HUVEC, and Caco. Therapeutic compositions of the invention can be used in the following:

Assays for T-cell or thymocyte proliferation include without limitation those described
in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E.
M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3,
In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; Chapter 7, Immunologic studies in
Humans); Takai et al., J. Immunol. 137:3494-3500, 1986; Bertagnolli et al., J. Immunol.
145:1706-1712, 1990; Bertagnolli et al., Cellular Immunology 133:327-341, 1991; Bertagnolli,
et al., J. Immunol. 149:3778-3783, 1992; Bowman et al., J. Immunol. 152:1756-1761, 1994.

Assays for cytokine production and/or proliferation of spleen cells, lymph node cells or 
thymocytes include, without limitation, those described in: Polyclonal T cell stimulation, 
Kruisbeek, A. M. and Shevach, E. M. In Current Protocols in Immunology. J. E. e.a. Coligan
eds. Vol 1 pp. 3.12.1-3.12.14, John Wiley and Sons, Toronto. 1994; and Measurement of mouse
and human interleukin-γ, Schreiber, R. D. In Current Protocols in Immunology. J. E. e.a. Coligan
eds. Vol 1 pp. 6.8.1-6.8.8, John Wiley and Sons, Toronto. 1994.

Assays for proliferation and differentiation of hematopoietic and lymphopoietic cells 
include, without limitation, those described in: Measurement of Human and Murine Interleukin 2
and Interleukin 4, Bottomly, K., Davis, L. S. and Lipsky, P. E. In Current Protocols in
Immunology. J. E. e.a. Coligan eds. Vol 1 pp. 6.3.1-6.3.12, John Wiley and Sons, Toronto. 1991;
deVries et al., J. Exp. Med. 173:1205-1211, 1991; Moreau et al., Nature 336:690-692, 1988;
Greenberger et al., Proc. Natl. Acad. Sci. U.S.A. 80:2931-2938, 1983; Measurement of mouse
and human interleukin 6-Nordan, R. In Current Protocols in Immunology. J. E. Coligan eds. Vol
1 pp. 6.6.1-6.6.5, John Wiley and Sons, Toronto. 1991; Smith et al., Proc. Natl. Acad. Sci.
U.S.A. 83:1857-1861, 1986; Measurement of human Interleukin 11-Bennett, F., Giannotti, J.,
Clark, S. C. and Turner, K. J. In Current Protocols in Immunology. J. E. Coligan eds. Vol 1 pp.
6.15.1 John Wiley and Sons, Toronto. 1991; Measurement of mouse and human Interleukin
9-Ciarletta, A., Giannotti, J., Clark, S. C. and Turner, K. J. In Current Protocols in Immunology.
J. E. Coligan eds. Vol 1 pp. 6.13.1, John Wiley and Sons, Toronto. 1991.

Assays for T-cell clone responses to antigens (which will identify, among others, proteins 
that affect APC-T cell interactions as well as direct T-cell effects by measuring proliferation and 
cytokine production) include, without limitation, those described in: Current Protocols in 
Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober,
Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3, In Vitro assays for Mouse
Lymphocyte Function; Chapter 6, Cytokines and their cellular receptors; Chapter 7,
Immunologic studies in Humans); Weinberger et al., Proc. Natl. Acad. Sci. USA 77:6091-6095,
1980; Weinberger et al., Eur. J. Immun. 11:405-411, 1981; Takai et al., J. Immunol.
137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512, 1988.



4.10.4 STEM CELL GROWTH FACTOR ACTIVITY

A polypeptide of the present invention may exhibit stem cell growth factor activity and 
be involved in the proliferation, differentiation and survival of pluripotent and totipotent stem 
cells including primordial germ cells, embryonic stem cells, hematopoietic stem cells and/or
germ line stem cells. Administration of the polypeptide of the invention to stem cells in vivo or
ex vivo is expected to maintain and expand cell populations in a totipotential or pluripotential
state which would be useful for re-engineering damaged or diseased tissues, transplantation,
manufacture of bio-pharmaceuticals and the development of bio-sensors. The ability to produce
large quantities of human cells has important working applications for the production of human
proteins which currently must be obtained from non-human sources or donors; implantation of
cells to treat diseases such as Parkinson's, Alzheimer's and other neurodegenerative diseases;
tissues for grafting such as bone marrow, skin, cartilage, tendons, bone, muscle (including
cardiac muscle), blood vessels, cornea, neural cells, gastrointestinal cells and others; and organs
for transplantation such as kidney, liver, pancreas (including islet cells), heart and lung.

It is contemplated that multiple different exogenous growth factors and/or cytokines may 
be administered in combination with the polypeptide of the invention to achieve the desired 
effect, including any of the growth factors listed herein, other stem cell maintenance factors, and 
specifically including stem cell factor (SCF), leukemia inhibitory factor (LIF), Flt-3 ligand
(Flt-3L), any of the interleukins, recombinant soluble IL-6 receptor fused to IL-6, macrophage
inflammatory protein 1-alpha (MIP-1-alpha), G-CSF, GM-CSF, thrombopoietin (TPO), platelet
factor 4 (PF-4), platelet-derived growth factor (PDGF), neural growth factors and basic fibroblast
growth factor (bFGF).

Since totipotent stem cells can give rise to virtually any mature cell type, expansion of 
these cells in culture will facilitate the production of large quantities of mature cells. Techniques 
for culturing stem cells are known in the art and administration of polypeptides of the invention, 
optionally with other growth factors and/or cytokines, is expected to enhance the survival and 
proliferation of the stem cell populations. This can be accomplished by direct administration of 
the polypeptide of the invention to the culture medium. Alternatively, stroma cells transfected 
with a polynucleotide that encodes for the polypeptide of the invention can be used as a feeder
layer for the stem cell populations in culture or in vivo. Stromal support cells for feeder layers
may include embryonic bone marrow fibroblasts, bone marrow stromal cells, fetal liver cells, or 
cultured embryonic fibroblasts (see U.S. Patent No. 5,690,926). 

Stem cells themselves can be transfected with a polynucleotide of the invention to induce 
autocrine expression of the polypeptide of the invention. This will allow for generation of
undifferentiated totipotential/pluripotential stem cell lines that are useful as is or that can then be
differentiated into the desired mature cell types. These stable cell lines can also serve as a source
of undifferentiated totipotential/pluripotential mRNA to create cDNA libraries and templates for
polymerase chain reaction experiments. These studies would allow for the isolation and
identification of differentially expressed genes in stem cell populations that regulate stem cell
proliferation and/or maintenance.

Expansion and maintenance of totipotent stem cell populations will be useful in the 
treatment of many pathological conditions. For example, polypeptides of the present invention 
may be used to manipulate stem cells in culture to give rise to neuroepithelial cells that can be
used to augment or replace cells damaged by illness, autoimmune disease, accidental damage or
genetic disorders. The polypeptide of the invention may be useful for inducing the proliferation
of neural cells and for the regeneration of nerve and brain tissue, i.e. for the treatment of central
and peripheral nervous system diseases and neuropathies, as well as mechanical and traumatic
disorders which involve degeneration, death or trauma to neural cells or nerve tissue. In addition,
the expanded stem cell populations can also be genetically altered for gene therapy purposes and
to decrease host rejection of replacement tissues after grafting or implantation.

Expression of the polypeptide of the invention and its effect on stem cells can also be
manipulated to achieve controlled differentiation of the stem cells into more differentiated cell
types. A broadly applicable method of obtaining pure populations of a specific differentiated
cell type from undifferentiated stem cell populations involves the use of a cell-type specific
promoter driving a selectable marker. The selectable marker allows only cells of the desired type
to survive. For example, stem cells can be induced to differentiate into cardiomyocytes (Wobus
et al., Differentiation, 48: 173-182, (1991); Klug et al., J. Clin. Invest., 98(1): 216-224, (1998))
or skeletal muscle cells (Browder, L. W. In: Principles of Tissue Engineering eds. Lanza et al.,
Academic Press (1997)). Alternatively, directed differentiation of stem cells can be
accomplished by culturing the stem cells in the presence of a differentiation factor such as
retinoic acid and an antagonist of the polypeptide of the invention which would inhibit the
effects of endogenous stem cell factor activity and allow differentiation to proceed.

In vitro cultures of stem cells can be used to determine if the polypeptide of the invention
exhibits stem cell growth factor activity. Stem cells are isolated from any one of various cell
sources (including hematopoietic stem cells and embryonic stem cells) and cultured on a feeder
layer, as described by Thompson et al., Proc. Natl. Acad. Sci. U.S.A., 92: 7844-7848 (1995), in
the presence of the polypeptide of the invention alone or in combination with other growth
factors or cytokines. The ability of the polypeptide of the invention to induce stem cell
proliferation is determined by colony formation on semi-solid support, e.g. as described by
Bernstein et al., Blood, 77: 2316-2321 (1991).

4.10.5 HEMATOPOIESIS REGULATING ACTIVITY

A polypeptide of the present invention may be involved in regulation of hematopoiesis
and, consequently, in the treatment of myeloid or lymphoid cell disorders. Even marginal
biological activity in support of colony forming cells or of factor-dependent cell lines indicates
involvement in regulating hematopoiesis, e.g. in supporting the growth and proliferation of
erythroid progenitor cells alone or in combination with other cytokines, thereby indicating utility,
for example, in treating various anemias or for use in conjunction with irradiation/chemotherapy
to stimulate the production of erythroid precursors and/or erythroid cells; in supporting the
growth and proliferation of myeloid cells such as granulocytes and monocytes/macrophages (i.e.,
traditional CSF activity) useful, for example, in conjunction with chemotherapy to prevent or
treat consequent myelo-suppression; in supporting the growth and proliferation of
megakaryocytes and consequently of platelets thereby allowing prevention or treatment of
various platelet disorders such as thrombocytopenia, and generally for use in place of or
complementary to platelet transfusions; and/or in supporting the growth and proliferation of
hematopoietic stem cells which are capable of maturing to any and all of the above-mentioned
hematopoietic cells and therefore find therapeutic utility in various stem cell disorders (such as
those usually treated with transplantation, including, without limitation, aplastic anemia and
paroxysmal nocturnal hemoglobinuria), as well as in repopulating the stem cell compartment
post irradiation/chemotherapy, either in-vivo or ex-vivo (i.e., in conjunction with bone marrow
transplantation or with peripheral progenitor cell transplantation (homologous or heterologous))
as normal cells or genetically manipulated for gene therapy.

Therapeutic compositions of the invention can be used in the following: 

Suitable assays for proliferation and differentiation of various hematopoietic lines are
cited above.

Assays for embryonic stem cell differentiation (which will identify, among others,
proteins that influence embryonic differentiation hematopoiesis) include, without limitation,
those described in: Johansson et al. Cellular Biology 15:141-151, 1995; Keller et al., Molecular
and Cellular Biology 13:473-486, 1993; McClanahan et al., Blood 81:2903-2915, 1993.

Assays for stem cell survival and differentiation (which will identify, among others,
proteins that regulate lympho-hematopoiesis) include, without limitation, those described in:
Methylcellulose colony forming assays, Freshney, M. G. In Culture of Hematopoietic Cells. R. I.
Freshney, et al. eds. Vol pp. 265-268, Wiley-Liss, Inc., New York, N.Y. 1994; Hirayama et al.,
Proc. Natl. Acad. Sci. USA 89:5907-5911, 1992; Primitive hematopoietic colony forming cells
with high proliferative potential, McNiece, I. K. and Briddell, R. A. In Culture of Hematopoietic
Cells. R. I. Freshney, et al. eds. Vol pp. 23-39, Wiley-Liss, Inc., New York, N.Y. 1994; Neben et
al., Experimental Hematology 22:353-359, 1994; Cobblestone area forming cell assay,
Ploemacher, R. E. In Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp. 1-21,
Wiley-Liss, Inc., New York, N.Y. 1994; Long term bone marrow cultures in the presence of
stromal cells, Spooncer, E., Dexter, M. and Allen, T. In Culture of Hematopoietic Cells. R. I.
Freshney, et al. eds. Vol pp. 163-179, Wiley-Liss, Inc., New York, N.Y. 1994; Long term culture
initiating cell assay, Sutherland, H. J. In Culture of Hematopoietic Cells. R. I. Freshney, et al.
eds. Vol pp. 139-162, Wiley-Liss, Inc., New York, N.Y. 1994.

4.10.6 TISSUE GROWTH ACTIVITY 

A polypeptide of the present invention also may be involved in bone, cartilage, tendon, 
ligament and/or nerve tissue growth or regeneration, as well as in wound healing and tissue 
repair and replacement, and in healing of burns, incisions and ulcers.

A polypeptide of the present invention which induces cartilage and/or bone growth in
circumstances where bone is not normally formed has application in the healing of bone
fractures and cartilage damage or defects in humans and other animals. Compositions of a
polypeptide, antibody, binding partner, or other modulator of the invention may have
prophylactic use in closed as well as open fracture reduction and also in the improved fixation of
artificial joints. De novo bone formation induced by an osteogenic agent contributes to the repair
of congenital, trauma induced, or oncologic resection induced craniofacial defects, and also is
useful in cosmetic plastic surgery.

A polypeptide of this invention may also be involved in attracting bone-forming cells,
stimulating growth of bone-forming cells, or inducing differentiation of progenitors of
bone-forming cells. Treatment of osteoporosis, osteoarthritis, bone degenerative disorders, or
periodontal disease, such as through stimulation of bone and/or cartilage repair or by blocking
inflammation or processes of tissue destruction (collagenase activity, osteoclast activity, etc.)
mediated by inflammatory processes may also be possible using the composition of the
invention.

Another category of tissue regeneration activity that may involve the polypeptide of the
present invention is tendon/ligament formation. Induction of tendon/ligament-like tissue or
other tissue formation in circumstances where such tissue is not normally formed has application
in the healing of tendon or ligament tears, deformities and other tendon or ligament defects in
humans and other animals. Such a preparation employing a tendon/ligament-like tissue inducing
protein may have prophylactic use in preventing damage to tendon or ligament tissue, as well as
use in the improved fixation of tendon or ligament to bone or other tissues, and in repairing
defects to tendon or ligament tissue. De novo tendon/ligament-like tissue formation induced by
a composition of the present invention contributes to the repair of congenital, trauma induced, or
other tendon or ligament defects of other origin, and is also useful in cosmetic plastic surgery for
attachment or repair of tendons or ligaments. The compositions of the present invention may
provide an environment to attract tendon- or ligament-forming cells, stimulate growth of tendon- or
ligament-forming cells, induce differentiation of progenitors of tendon- or ligament-forming
cells, or induce growth of tendon/ligament cells or progenitors ex vivo for return in vivo to effect
tissue repair. The compositions of the invention may also be useful in the treatment of tendinitis,
carpal tunnel syndrome and other tendon or ligament defects. The compositions may also include
an appropriate matrix and/or sequestering agent as a carrier as is well known in the art.

The compositions of the present invention may also be useful for proliferation of neural
cells and for regeneration of nerve and brain tissue, i.e. for the treatment of central and peripheral
nervous system diseases and neuropathies, as well as mechanical and traumatic disorders, which
involve degeneration, death or trauma to neural cells or nerve tissue. More specifically, a
composition may be used in the treatment of diseases of the peripheral nervous system, such as
peripheral nerve injuries, peripheral neuropathy and localized neuropathies, and central nervous
system diseases, such as Alzheimer's, Parkinson's disease, Huntington's disease, amyotrophic
lateral sclerosis, and Shy-Drager syndrome. Further conditions which may be treated in
accordance with the present invention include mechanical and traumatic disorders, such as spinal
cord disorders, head trauma and cerebrovascular diseases such as stroke. Peripheral neuropathies
resulting from chemotherapy or other medical therapies may also be treatable using a
composition of the invention.

Compositions of the invention may also be useful to promote better or faster closure of
non-healing wounds, including without limitation pressure ulcers, ulcers associated with vascular
insufficiency, surgical and traumatic wounds, and the like.

Compositions of the present invention may also be involved in the generation or
regeneration of other tissues, such as organs (including, for example, pancreas, liver, intestine,
kidney, skin, endothelium), muscle (smooth, skeletal or cardiac) and vascular (including vascular
endothelium) tissue, or for promoting the growth of cells comprising such tissues. Part of the
desired effects may be by inhibition or modulation of fibrotic scarring to allow normal tissue
to regenerate. A polypeptide of the present invention may also exhibit angiogenic activity.

A composition of the present invention may also be useful for gut protection or
regeneration and treatment of lung or liver fibrosis, reperfusion injury in various tissues, and
conditions resulting from systemic cytokine damage.

A composition of the present invention may also be useful for promoting or inhibiting
differentiation of tissues described above from precursor tissues or cells; or for inhibiting the
growth of tissues described above.

Therapeutic compositions of the invention can be used in the following:

Assays for tissue generation activity include, without limitation, those described in:
International Patent Publication No. WO95/16035 (bone, cartilage, tendon); International Patent
Publication No. WO95/05846 (nerve, neuronal); International Patent Publication No.
WO91/07491 (skin, endothelium).

Assays for wound healing activity include, without limitation, those described in: Winter,
Epidermal Wound Healing, pps. 71-112 (Maibach, H. I. and Rovee, D. T., eds.), Year Book
Medical Publishers, Inc., Chicago, as modified by Eaglstein and Mertz, J. Invest. Dermatol.
71:382-84 (1978).

4.10.7 IMMUNE STIMULATING OR SUPPRESSING ACTIVITY

A polypeptide of the present invention may also exhibit immune stimulating or immune
suppressing activity, including without limitation the activities for which assays are described
herein. A polynucleotide of the invention can encode a polypeptide exhibiting such activities. A
protein may be useful in the treatment of various immune deficiencies and disorders (including
severe combined immunodeficiency (SCID)), e.g., in regulating (up or down) growth and
proliferation of T and/or B lymphocytes, as well as affecting the cytolytic activity of NK cells
and other cell populations. These immune deficiencies may be genetic or be caused by viral (e.g.,
HIV) as well as bacterial or fungal infections, or may result from autoimmune disorders. More
specifically, infectious diseases caused by viral, bacterial, fungal or other infection may be
treatable using a protein of the present invention, including infections by HIV, hepatitis viruses,
herpes viruses, mycobacteria, Leishmania spp., malaria spp. and various fungal infections such
as candidiasis. Of course, in this regard, proteins of the present invention may also be useful
where a boost to the immune system generally may be desirable, i.e., in the treatment of cancer.

Autoimmune disorders which may be treated using a protein of the present invention
include, for example, connective tissue disease, multiple sclerosis, systemic lupus erythematosus,
rheumatoid arthritis, autoimmune pulmonary inflammation, Guillain-Barre syndrome,
autoimmune thyroiditis, insulin dependent diabetes mellitus, myasthenia gravis, graft-versus-host
disease and autoimmune inflammatory eye disease. Such a protein (or antagonists thereof,
including antibodies) of the present invention may also be useful in the treatment of allergic
reactions and conditions (e.g., anaphylaxis, serum sickness, drug reactions, food allergies, insect
venom allergies, mastocytosis, allergic rhinitis, hypersensitivity pneumonitis, urticaria,
angioedema, eczema, atopic dermatitis, allergic contact dermatitis, erythema multiforme,
Stevens-Johnson syndrome, allergic conjunctivitis, atopic keratoconjunctivitis, venereal
keratoconjunctivitis, giant papillary conjunctivitis and contact allergies), such as asthma
(particularly allergic asthma) or other respiratory problems. Other conditions, in which immune
suppression is desired (including, for example, organ transplantation), may also be treatable
using a protein (or antagonists thereof) of the present invention. The therapeutic effects of the
polypeptides or antagonists thereof on allergic reactions can be evaluated by in vivo animal
models such as the cumulative contact enhancement test (Lastbom et al., Toxicology 125: 59-66,
1998), skin prick test (Hoffmann et al., Allergy 54: 446-54, 1999), guinea pig skin sensitization
test (Vohr et al., Arch. Toxicol. 73: 501-9), and murine local lymph node assay (Kimber et al.,
J. Toxicol. Environ. Health 53: 563-79).

Using the proteins of the invention it may also be possible to modulate immune
responses in a number of ways. Down regulation may be in the form of inhibiting or blocking an
immune response already in progress or may involve preventing the induction of an immune
response. The functions of activated T cells may be inhibited by suppressing T cell responses or
by inducing specific tolerance in T cells, or both. Immunosuppression of T cell responses is
generally an active, non-antigen-specific, process which requires continuous exposure of the T
cells to the suppressive agent. Tolerance, which involves inducing non-responsiveness or anergy
in T cells, is distinguishable from immunosuppression in that it is generally antigen-specific and
persists after exposure to the tolerizing agent has ceased. Operationally, tolerance can be
demonstrated by the lack of a T cell response upon reexposure to specific antigen in the absence
of the tolerizing agent.

Down regulating or preventing one or more antigen functions (including without
limitation B lymphocyte antigen functions (such as, for example, B7)), e.g., preventing high
level lymphokine synthesis by activated T cells, will be useful in situations of tissue, skin and
organ transplantation and in graft-versus-host disease (GVHD). For example, blockage of T cell
function should result in reduced tissue destruction in tissue transplantation. Typically, in tissue
transplants, rejection of the transplant is initiated through its recognition as foreign by T cells,
followed by an immune reaction that destroys the transplant. The administration of a therapeutic
composition of the invention may prevent cytokine synthesis by immune cells, such as T cells,
and thus acts as an immunosuppressant. Moreover, a lack of costimulation may also be sufficient
to anergize the T cells, thereby inducing tolerance in a subject. Induction of long-term tolerance
by B lymphocyte antigen-blocking reagents may avoid the necessity of repeated administration
of these blocking reagents. To achieve sufficient immunosuppression or tolerance in a subject, it
may also be necessary to block the function of a combination of B lymphocyte antigens.

The efficacy of particular therapeutic compositions in preventing organ transplant
rejection or GVHD can be assessed using animal models that are predictive of efficacy in
humans. Examples of appropriate systems which can be used include allogeneic cardiac grafts in
rats and xenogeneic pancreatic islet cell grafts in mice, both of which have been used to examine
the immunosuppressive effects of CTLA4Ig fusion proteins in vivo as described in Lenschow et
al., Science 257:789-792 (1992) and Turka et al., Proc. Natl. Acad. Sci. USA, 89:11102-11105
(1992). In addition, murine models of GVHD (see Paul ed., Fundamental Immunology, Raven
Press, New York, 1989, pp. 846-847) can be used to determine the effect of therapeutic
compositions of the invention on the development of that disease.

Blocking antigen function may also be therapeutically useful for treating autoimmune
diseases. Many autoimmune disorders are the result of inappropriate activation of T cells that are
reactive against self tissue and which promote the production of cytokines and autoantibodies
involved in the pathology of the diseases. Preventing the activation of autoreactive T cells may
reduce or eliminate disease symptoms. Administration of reagents which block stimulation of T
cells can be used to inhibit T cell activation and prevent production of autoantibodies or T
cell-derived cytokines which may be involved in the disease process. Additionally, blocking
reagents may induce antigen-specific tolerance of autoreactive T cells which could lead to
long-term relief from the disease. The efficacy of blocking reagents in preventing or alleviating
autoimmune disorders can be determined using a number of well-characterized animal models of
human autoimmune diseases. Examples include murine experimental autoimmune encephalitis,
systemic lupus erythematosus in MRL/lpr/lpr mice or NZB hybrid mice, murine autoimmune
collagen arthritis, diabetes mellitus in NOD mice and BB rats, and murine experimental
myasthenia gravis (see Paul ed., Fundamental Immunology, Raven Press, New York, 1989, pp.
840-856).

Upregulation of an antigen function (e.g., a B lymphocyte antigen function), as a means
of upregulating immune responses, may also be useful in therapy. Upregulation of immune
responses may be in the form of enhancing an existing immune response or eliciting an initial
immune response. For example, enhancing an immune response may be useful in cases of viral
infection, including systemic viral diseases such as influenza, the common cold, and encephalitis.

Alternatively, anti-viral immune responses may be enhanced in an infected patient by
removing T cells from the patient, costimulating the T cells in vitro with viral antigen-pulsed
APCs either expressing a peptide of the present invention or together with a stimulatory form of
a soluble peptide of the present invention and reintroducing the in vitro activated T cells into the
patient. Another method of enhancing anti-viral immune responses would be to isolate infected
cells from a patient, transfect them with a nucleic acid encoding a protein of the present
invention as described herein such that the cells express all or a portion of the protein on their
surface, and reintroduce the transfected cells into the patient. The infected cells would now be
capable of delivering a costimulatory signal to, and thereby activate, T cells in vivo.

A polypeptide of the present invention may provide the necessary stimulation signal to T
cells to induce a T cell mediated immune response against the transfected tumor cells. In
addition, tumor cells which lack MHC class I or MHC class II molecules, or which fail to
reexpress sufficient amounts of MHC class I or MHC class II molecules, can be transfected with
nucleic acid encoding all or a portion (e.g., a cytoplasmic-domain truncated portion) of an
MHC class I alpha chain protein and β2 microglobulin protein or an MHC class II alpha chain
protein and an MHC class II beta chain protein to thereby express MHC class I or MHC class II
proteins on the cell surface. Expression of the appropriate class I or class II MHC in conjunction
with a peptide having the activity of a B lymphocyte antigen (e.g., B7-1, B7-2, B7-3) induces a T
cell mediated immune response against the transfected tumor cell. Optionally, a gene encoding
an antisense construct which blocks expression of an MHC class II associated protein, such as
the invariant chain, can also be cotransfected with a DNA encoding a peptide having the activity
of a B lymphocyte antigen to promote presentation of tumor associated antigens and induce
tumor specific immunity. Thus, the induction of a T cell mediated immune response in a human
subject may be sufficient to overcome tumor-specific tolerance in the subject.

The activity of a protein of the invention may, among other means, be measured by the
following methods:

Suitable assays for thymocyte or splenocyte cytotoxicity include, without limitation,
those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D.
H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and
Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte Function 3.1-3.19;
Chapter 7, Immunologic studies in Humans); Herrmann et al., Proc. Natl. Acad. Sci. USA
78:2488-2492, 1981; Herrmann et al., J. Immunol. 128:1968-1974, 1982; Handa et al., J.
Immunol. 135:1564-1572, 1985; Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J.
Immunol. 140:508-512, 1988; Bowman et al., J. Virology 61:1992-1998; Bertagnolli et al.,
Cellular Immunology 133:327-341, 1991; Brown et al., J. Immunol. 153:3079-3092, 1994.

Assays for T-cell-dependent immunoglobulin responses and isotype switching (which
will identify, among others, proteins that modulate T-cell dependent antibody responses and that
affect Th1/Th2 profiles) include, without limitation, those described in: Maliszewski, J.
Immunol. 144:3028-3033, 1990; and Assays for B cell function: In vitro antibody production,
Mond, J. J. and Brunswick, M. In Current Protocols in Immunology. J. E. Coligan et al. eds. Vol 1
pp. 3.8.1-3.8.16, John Wiley and Sons, Toronto. 1994.

Mixed lymphocyte reaction (MLR) assays (which will identify, among others, proteins
that generate predominantly Th1 and CTL responses) include, without limitation, those described
in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E.
M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3,
In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; Chapter 7, Immunologic studies in
Humans); Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512,
1988; Bertagnolli et al., J. Immunol. 149:3778-3783, 1992.

Dendritic cell-dependent assays (which will identify, among others, proteins expressed by
dendritic cells that activate naive T-cells) include, without limitation, those described in: Guery
et al., J. Immunol. 134:536-544, 1995; Inaba et al., Journal of Experimental Medicine
173:549-559, 1991; Macatonia et al., Journal of Immunology 154:5071-5079, 1995; Porgador et
al., Journal of Experimental Medicine 182:255-260, 1995; Nair et al., Journal of Virology
67:4062-4069, 1993; Huang et al., Science 264:961-965, 1994; Macatonia et al., Journal of
Experimental Medicine 169:1255-1264, 1989; Bhardwaj et al., Journal of Clinical Investigation
94:797-807, 1994; and Inaba et al., Journal of Experimental Medicine 172:631-640, 1990.

Assays for lymphocyte survival/apoptosis (which will identify, among others, proteins
that prevent apoptosis after superantigen induction and proteins that regulate lymphocyte
homeostasis) include, without limitation, those described in: Darzynkiewicz et al., Cytometry
13:795-808, 1992; Gorczyca et al., Leukemia 7:659-670, 1993; Gorczyca et al., Cancer Research
53:1945-1951, 1993; Itoh et al., Cell 66:233-243, 1991; Zacharchuk, Journal of Immunology
145:4037-4045, 1990; Zamai et al., Cytometry 14:891-897, 1993; Gorczyca et al., International
Journal of Oncology 1:639-648, 1992.

Assays for proteins that influence early steps of T-cell commitment and development
include, without limitation, those described in: Antica et al., Blood 84:111-117, 1994; Fine et al.,
Cellular Immunology 155:111-122, 1994; Galy et al., Blood 85:2770-2778, 1995; Toki et al.,
Proc. Nat. Acad. Sci. USA 88:7548-7551, 1991.

4.10.8 ACTIVIN/INHIBIN ACTIVITY

A polypeptide of the present invention may also exhibit activin- or inhibin-related
activities. A polynucleotide of the invention may encode a polypeptide exhibiting such
characteristics. Inhibins are characterized by their ability to inhibit the release of follicle
stimulating hormone (FSH), while activins are characterized by their ability to stimulate the
release of follicle stimulating hormone (FSH). Thus, a polypeptide of the present invention,
alone or in heterodimers with a member of the inhibin family, may be useful as a contraceptive
based on the ability of inhibins to decrease fertility in female mammals and decrease
spermatogenesis in male mammals. Administration of sufficient amounts of other inhibins can
induce infertility in these mammals. Alternatively, the polypeptide of the invention, as a
homodimer or as a heterodimer with other protein subunits of the inhibin group, may be useful as
a fertility inducing therapeutic, based upon the ability of activin molecules to stimulate FSH
release from cells of the anterior pituitary. See, for example, U.S. Pat. No. 4,798,885. A
polypeptide of the invention may also be useful for advancement of the onset of fertility in
sexually immature mammals, so as to increase the lifetime reproductive performance of domestic
animals such as, but not limited to, cows, sheep and pigs.

The activity of a polypeptide of the invention may, among other means, be measured by
the following methods.

Assays for activin/inhibin activity include, without limitation, those described in: Vale et
al., Endocrinology 91:562-572, 1972; Ling et al., Nature 321:779-782, 1986; Vale et al., Nature
321:776-779, 1986; Mason et al., Nature 318:659-663, 1985; Forage et al., Proc. Natl. Acad. Sci.
USA 83:3091-3095, 1986.

4.10.9 CHEMOTACTIC/CHEMOKINETIC ACTIVITY 

A polypeptide of the present invention may be involved in chemotactic or chemokinetic
activity for mammalian cells, including, for example, monocytes, fibroblasts, neutrophils,
T-cells, mast cells, eosinophils, epithelial and/or endothelial cells. A polynucleotide of the
invention can encode a polypeptide exhibiting such attributes. Chemotactic and chemokinetic
receptor activation can be used to mobilize or attract a desired cell population to a desired site of
action. Chemotactic or chemokinetic compositions (e.g. proteins, antibodies, binding partners, or
modulators of the invention) provide particular advantages in treatment of wounds and other
trauma to tissues, as well as in treatment of localized infections. For example, attraction of
lymphocytes, monocytes or neutrophils to tumors or sites of infection may result in improved
immune responses against the tumor or infecting agent.

A protein or peptide has chemotactic activity for a particular cell population if it can
stimulate, directly or indirectly, the directed orientation or movement of such cell population.
Preferably, the protein or peptide has the ability to directly stimulate directed movement of cells.
Whether a particular protein has chemotactic activity for a population of cells can be readily
determined by employing such protein or peptide in any known assay for cell chemotaxis.

Therapeutic compositions of the invention can be used in the following:

Assays for chemotactic activity (which will identify proteins that induce or prevent
chemotaxis) consist of assays that measure the ability of a protein to induce the migration of cells
across a membrane as well as the ability of a protein to induce the adhesion of one cell
population to another cell population. Suitable assays for movement and adhesion include,
without limitation, those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A.
M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates
and Wiley-Interscience (Chapter 6.12, Measurement of alpha and beta Chemokines
6.12.1-6.12.28); Taub et al. J. Clin. Invest. 95:1370-1376, 1995; Lind et al. APMIS 103:140-146,
1995; Muller et al. Eur. J. Immunol. 25:1744-1748; Gruber et al. J. of Immunol. 152:5860-5867,
1994; Johnston et al. J. of Immunol. 153:1762-1768, 1994.

4.10.10 HEMOSTATIC AND THROMBOLYTIC ACTIVITY

A polypeptide of the invention may also be involved in hemostasis or thrombolysis or
thrombosis. A polynucleotide of the invention can encode a polypeptide exhibiting such
attributes. Compositions may be useful in treatment of various coagulation disorders (including
hereditary disorders, such as hemophilias) or to enhance coagulation and other hemostatic events
in treating wounds resulting from trauma, surgery or other causes. A composition of the
invention may also be useful for dissolving or inhibiting formation of thromboses and for
treatment and prevention of conditions resulting therefrom (such as, for example, infarction of
cardiac and central nervous system vessels (e.g., stroke)).

Therapeutic compositions of the invention can be used in the following:

Assays for hemostatic and thrombolytic activity include, without limitation, those
described in: Linet et al., J. Clin. Pharmacol. 26:131-140, 1986; Burdick et al., Thrombosis Res.
45:413-419, 1987; Humphrey et al., Fibrinolysis 5:71-79 (1991); Schaub, Prostaglandins
35:467-474, 1988.

4.10.11 CANCER DIAGNOSIS AND THERAPY

Polypeptides of the invention may be involved in cancer cell generation, proliferation or
metastasis. Detection of the presence or amount of polynucleotides or polypeptides of the
invention may be useful for the diagnosis and/or prognosis of one or more types of cancer. For
example, the presence or increased expression of a polynucleotide/polypeptide of the invention
may indicate a hereditary risk of cancer, a precancerous condition, or an ongoing malignancy.
Conversely, a defect in the gene or absence of the polypeptide may be associated with a cancer
condition. Identification of single nucleotide polymorphisms associated with cancer or a
predisposition to cancer may also be useful for diagnosis or prognosis.

Cancer treatments promote tumor regression by inhibiting tumor cell proliferation,
inhibiting angiogenesis (growth of new blood vessels that is necessary to support tumor growth)
and/or prohibiting metastasis by reducing tumor cell motility or invasiveness. Therapeutic
compositions of the invention may be effective in adult and pediatric oncology including in solid
phase tumors/malignancies, locally advanced tumors, human soft tissue sarcomas, metastatic
cancer, including lymphatic metastases, blood cell malignancies including multiple myeloma,
acute and chronic leukemias, and lymphomas, head and neck cancers including mouth cancer,
larynx cancer and thyroid cancer, lung cancers including small cell carcinoma and non-small cell
cancers, breast cancers including small cell carcinoma and ductal carcinoma, gastrointestinal
cancers including esophageal cancer, stomach cancer, colon cancer, colorectal cancer and polyps
associated with colorectal neoplasia, pancreatic cancers, liver cancer, urologic cancers including
bladder cancer and prostate cancer, malignancies of the female genital tract including ovarian
carcinoma, uterine (including endometrial) cancers, and solid tumor in the ovarian follicle,
kidney cancers including renal cell carcinoma, brain cancers including intrinsic brain tumors,
neuroblastoma, astrocytic brain tumors, gliomas, metastatic tumor cell invasion in the central
nervous system, bone cancers including osteomas, skin cancers including malignant melanoma,
tumor progression of human skin keratinocytes, squamous cell carcinoma, basal cell carcinoma,
hemangiopericytoma and Kaposi's sarcoma.

Polypeptides, polynucleotides, or modulators of polypeptides of the invention (including
inhibitors and stimulators of the biological activity of the polypeptide of the invention) may be
administered to treat cancer. Therapeutic compositions can be administered in therapeutically
effective dosages alone or in combination with adjuvant cancer therapy such as surgery,
chemotherapy, radiotherapy, thermotherapy, and laser therapy, and may provide a beneficial
effect, e.g., reducing tumor size, slowing rate of tumor growth, inhibiting metastasis, or otherwise
improving overall clinical condition, without necessarily eradicating the cancer.

The composition can also be administered in therapeutically effective amounts as a
portion of an anti-cancer cocktail. An anti-cancer cocktail is a mixture of the polypeptide or
modulator of the invention with one or more anti-cancer drugs in addition to a pharmaceutically
acceptable carrier for delivery. The use of anti-cancer cocktails as a cancer treatment is routine.
Anti-cancer drugs that are well known in the art and can be used as a treatment in combination
with the polypeptide or modulator of the invention include: Actinomycin D, Aminoglutethimide,
Asparaginase, Bleomycin, Busulfan, Carboplatin, Carmustine, Chlorambucil, Cisplatin (cis-DDP),
Cyclophosphamide, Cytarabine HCl (Cytosine arabinoside), Dacarbazine, Dactinomycin,
Daunorubicin HCl, Doxorubicin HCl, Estramustine phosphate sodium, Etoposide (V16-213),
Floxuridine, 5-Fluorouracil (5-FU), Flutamide, Hydroxyurea (hydroxycarbamide), Ifosfamide,
Interferon Alpha-2a, Interferon Alpha-2b, Leuprolide acetate (LHRH-releasing factor analog),
Lomustine, Mechlorethamine HCl (nitrogen mustard), Melphalan, Mercaptopurine, Mesna,
Methotrexate (MTX), Mitomycin, Mitoxantrone HCl, Octreotide, Plicamycin, Procarbazine HCl,
Streptozocin, Tamoxifen citrate, Thioguanine, Thiotepa, Vinblastine sulfate, Vincristine sulfate,
Amsacrine, Azacitidine, Hexamethylmelamine, Interleukin-2, Mitoguazone, Pentostatin,
Semustine, Teniposide, and Vindesine sulfate.

In addition, therapeutic compositions of the invention may be used for prophylactic
treatment of cancer. There are hereditary conditions and/or environmental situations (e.g.,
exposure to carcinogens) known in the art that predispose an individual to developing cancers.
Under these circumstances, it may be beneficial to treat these individuals with therapeutically
effective doses of the polypeptide of the invention to reduce the risk of developing cancers.

In vitro models can be used to determine the effective doses of the polypeptide of the
invention as a potential cancer treatment. These in vitro models include proliferation assays of
cultured tumor cells, growth of cultured tumor cells in soft agar (see Freshney, (1987) Culture of
Animal Cells: A Manual of Basic Technique, Wiley-Liss, New York, NY, Ch. 18 and Ch. 21),
tumor systems in nude mice as described in Giovanella et al., J. Natl. Can. Inst., 52:921-30
(1974), mobility and invasive potential of tumor cells in Boyden Chamber assays as described in
Pilkington et al., Anticancer Res., 17:4107-9 (1997), and angiogenesis assays such as induction
of vascularization of the chick chorioallantoic membrane or induction of vascular endothelial
cell migration as described in Ribatta et al., Intl. J. Dev. Biol., 40:1189-97 (1999) and Li et al.,
Clin. Exp. Metastasis, 17:423-9 (1999), respectively. Suitable tumor cell lines are available,
e.g., from American Type Culture Collection catalogs.

4.10.12 RECEPTOR/LIGAND ACTIVITY 

A polypeptide of the present invention may also demonstrate activity as a receptor,
receptor ligand or inhibitor or agonist of receptor/ligand interactions. A polynucleotide of the
invention can encode a polypeptide exhibiting such characteristics. Examples of such receptors
and ligands include, without limitation, cytokine receptors and their ligands, receptor kinases and
their ligands, receptor phosphatases and their ligands, receptors involved in cell-cell interactions
and their ligands (including without limitation, cellular adhesion molecules (such as selectins,
integrins and their ligands) and receptor/ligand pairs involved in antigen presentation, antigen
recognition and development of cellular and humoral immune responses). Receptors and ligands
are also useful for screening of potential peptide or small molecule inhibitors of the relevant
receptor/ligand interaction. A protein of the present invention (including, without limitation,
fragments of receptors and ligands) may itself be useful as an inhibitor of receptor/ligand
interactions.

The activity of a polypeptide of the invention may, among other means, be measured by
the following methods:

Suitable assays for receptor-ligand activity include without limitation those described in:
Current Protocols in Immunology, Ed. by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M.
Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 7.28,
Measurement of Cellular Adhesion under static conditions 7.28.1-7.28.22); Takai et al., Proc.
Natl. Acad. Sci. USA 84:6864-6868, 1987; Bierer et al., J. Exp. Med. 168:1145-1156, 1988;
Rosenstein et al., J. Exp. Med. 169:149-160, 1989; Stoltenborg et al., J. Immunol. Methods
175:59-68, 1994; Stitt et al., Cell 80:661-670, 1995.

By way of example, the polypeptides of the invention may be used as a receptor for a
ligand(s) thereby transmitting the biological activity of that ligand(s). Ligands may be identified
through binding assays, affinity chromatography, dihybrid screening assays, BIAcore assays, gel
overlay assays, or other methods known in the art.

Studies characterizing drugs or proteins as agonists, antagonists, partial agonists or
partial antagonists require the use of other proteins as competing ligands. The polypeptides of the
present invention or ligand(s) thereof may be labeled by being coupled to radioisotopes,
colorimetric molecules or toxin molecules by conventional methods ("Guide to Protein
Purification," Murray P. Deutscher (ed.), Methods in Enzymology Vol. 182 (1990), Academic
Press, Inc., San Diego). Examples of radioisotopes include, but are not limited to, tritium and
carbon-14. Examples of colorimetric molecules include, but are not limited to, fluorescent
molecules such as fluorescamine, or rhodamine or other colorimetric molecules. Examples of
toxins include, but are not limited to, ricin.



4.10.13 DRUG SCREENING

This invention is particularly useful for screening chemical compounds by using the
novel polypeptides or binding fragments thereof in any of a variety of drug screening techniques.
The polypeptides or fragments employed in such a test may either be free in solution, affixed to a
solid support, borne on a cell surface or located intracellularly. One method of drug screening
utilizes eukaryotic or prokaryotic host cells which are stably transformed with recombinant
nucleic acids expressing the polypeptide or a fragment thereof. Drugs are screened against such
transformed cells in competitive binding assays. Such cells, either in viable or fixed form, can
be used for standard binding assays. One may measure, for example, the formation of complexes
between polypeptides of the invention or fragments and the agent being tested or examine the
diminution in complex formation between the novel polypeptides and an appropriate cell line,
which are well known in the art.
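
As a rough illustration of the competitive binding readout described above, the diminution in complex formation can be expressed as a percent inhibition relative to a control measured without the test agent. The following Python sketch is not part of the described methods; the function name, the signal units and the optional background correction are assumptions introduced only for illustration.

```python
def percent_inhibition(signal_with_agent, signal_control, background=0.0):
    """Percent decrease in complex formation relative to a no-agent control.

    All arguments are assay signals in the same arbitrary units
    (hypothetical, e.g. bound radiolabel counts). `background` is an
    assumed non-specific signal subtracted from both readings.
    """
    control = signal_control - background
    if control <= 0:
        raise ValueError("control signal must exceed background")
    treated = signal_with_agent - background
    # 0 means no effect on complex formation; 100 means complete loss.
    return 100.0 * (1.0 - treated / control)
```

A test agent that reduces the measured complex signal to one quarter of the control value would score as roughly 75 percent inhibition under this definition.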

Sources for test compounds that may be screened for ability to bind to or modulate (i.e.,
increase or decrease) the activity of polypeptides of the invention include (1) inorganic and
organic chemical libraries, (2) natural product libraries, and (3) combinatorial libraries
comprised of either random or mimetic peptides, oligonucleotides or organic molecules.

Chemical libraries may be readily synthesized or purchased from a number of
commercial sources, and may include structural analogs of known compounds or compounds
that are identified as "hits" or "leads" via natural product screening.

The sources of natural product libraries are microorganisms (including bacteria and
fungi), animals, plants or other vegetation, or marine organisms, and libraries of mixtures for
screening may be created by: (1) fermentation and extraction of broths from soil, plant or marine
microorganisms or (2) extraction of the organisms themselves. Natural product libraries include
polyketides, non-ribosomal peptides, and (non-naturally occurring) variants thereof. For a
review, see Science 282:63-68 (1998).

Combinatorial libraries are composed of large numbers of peptides, oligonucleotides or
organic compounds and can be readily prepared by traditional automated synthesis methods,
PCR, cloning or proprietary synthetic methods. Of particular interest are peptide and
oligonucleotide combinatorial libraries. Still other libraries of interest include peptide, protein,
peptidomimetic, multiparallel synthetic collection, recombinatorial, and polypeptide libraries.
For a review of combinatorial chemistry and libraries created therefrom, see Myers, Curr. Opin.
Biotechnol. 8:701-707 (1997). For reviews and examples of peptidomimetic libraries, see
Al-Obeidi et al., Mol. Biotechnol. 9(3):205-23 (1998); Hruby et al., Curr. Opin. Chem. Biol.
1(1):114-19 (1997); Dorner et al., Bioorg. Med. Chem. 4(5):709-15 (1996) (alkylated dipeptides).

Identification of modulators through use of the various libraries described herein permits
modification of the candidate "hit" (or "lead") to optimize the capacity of the "hit" to bind a
polypeptide of the invention. The molecules identified in the binding assay are then tested for
antagonist or agonist activity in in vivo tissue culture or animal models that are well known in the
art. In brief, the molecules are titrated into a plurality of cell cultures or animals and then tested
for either cell/animal death or prolonged survival of the animal/cells.

The binding molecules thus identified may be complexed with toxins, e.g., ricin or
cholera, or with other compounds that are toxic to cells such as radioisotopes. The toxin-binding
molecule complex is then targeted to a tumor or other cell by the specificity of the binding
molecule for a polypeptide of the invention. Alternatively, the binding molecules may be
complexed with imaging agents for targeting and imaging purposes.

4.10.14 ASSAY FOR RECEPTOR ACTIVITY

The invention also provides methods to detect specific binding of a polypeptide, e.g., a
ligand or a receptor. The art provides numerous assays particularly useful for identifying
previously unknown binding partners for receptor polypeptides of the invention. For example,
expression cloning using mammalian or bacterial cells, or dihybrid screening assays can be used
to identify polynucleotides encoding binding partners. As another example, affinity
chromatography with the appropriate immobilized polypeptide of the invention can be used to
isolate polypeptides that recognize and bind polypeptides of the invention. There are a number
of different libraries used for the identification of compounds, and in particular small molecules,
that modulate (i.e., increase or decrease) biological activity of a polypeptide of the invention.
Ligands for receptor polypeptides of the invention can also be identified by adding exogenous
ligands, or cocktails of ligands, to two cell populations that are genetically identical except for
the expression of the receptor of the invention: one cell population expresses the receptor of the
invention whereas the other does not. The response of the two cell populations to the addition of
ligand(s) is then compared. Alternatively, an expression library can be co-expressed with the
polypeptide of the invention in cells and assayed for an autocrine response to identify potential
ligand(s). As still another example, BIAcore assays, gel overlay assays, or other methods known
in the art can be used to identify binding partner polypeptides, including, (1) organic and
inorganic chemical libraries, (2) natural product libraries, and (3) combinatorial libraries
comprised of random peptides, oligonucleotides or organic molecules.

The role of downstream intracellular signaling molecules in the signaling cascade of the
polypeptide of the invention can be determined. For example, a chimeric protein in which the
cytoplasmic domain of the polypeptide of the invention is fused to the extracellular portion of a
protein, whose ligand has been identified, is produced in a host cell. The cell is then incubated
with the ligand specific for the extracellular portion of the chimeric protein, thereby activating
the chimeric receptor. Known downstream proteins involved in intracellular signaling can then
be assayed for expected modifications, i.e., phosphorylation. Other methods known to those in the
art can also be used to identify signaling molecules involved in receptor activity.

4.10.15 ANTI-INFLAMMATORY ACTIVITY

Compositions of the present invention may also exhibit anti-inflammatory activity. The
anti-inflammatory activity may be achieved by providing a stimulus to cells involved in the
inflammatory response, by inhibiting or promoting cell-cell interactions (such as, for example,
cell adhesion), by inhibiting or promoting chemotaxis of cells involved in the inflammatory
process, inhibiting or promoting cell extravasation, or by stimulating or suppressing production
of other factors which more directly inhibit or promote an inflammatory response. Compositions
with such activities can be used to treat inflammatory conditions (including chronic or acute
conditions), including without limitation inflammation associated with infection (such as septic
shock, sepsis or systemic inflammatory response syndrome (SIRS)), ischemia-reperfusion injury,
endotoxin lethality, arthritis, complement-mediated hyperacute rejection, nephritis, cytokine or
chemokine-induced lung injury, inflammatory bowel disease, Crohn's disease or resulting from
over production of cytokines such as TNF or IL-1. Compositions of the invention may also be
useful to treat anaphylaxis and hypersensitivity to an antigenic substance or material.
Compositions of this invention may be utilized to prevent or treat conditions such as, but not
limited to, sepsis, acute pancreatitis, endotoxin shock, cytokine induced shock, rheumatoid
arthritis, chronic inflammatory arthritis, pancreatic cell damage from diabetes mellitus type 1,
graft versus host disease, inflammatory bowel disease, inflammation associated with pulmonary
disease, other autoimmune disease or inflammatory disease, or as an antiproliferative agent, such
as for acute or chronic myelogenous leukemia, or in the prevention of premature labor secondary
to intrauterine infections.

4.10.16 LEUKEMIAS 

Leukemias and related disorders may be treated or prevented by administration of a
therapeutic that promotes or inhibits function of the polynucleotides and/or polypeptides of the
invention. Such leukemias and related disorders include but are not limited to acute leukemia,
acute lymphocytic leukemia, acute myelocytic leukemia, myeloblastic, promyelocytic,
myelomonocytic, monocytic, erythroleukemia, chronic leukemia, chronic myelocytic
(granulocytic) leukemia and chronic lymphocytic leukemia (for a review of such disorders, see
Fishman et al., 1985, Medicine, 2d Ed., J.B. Lippincott Co., Philadelphia).

4.10.17 NERVOUS SYSTEM DISORDERS 

Nervous system disorders, involving cell types which can be tested for efficacy of
intervention with compounds that modulate the activity of the polynucleotides and/or
polypeptides of the invention, and which can be treated upon thus observing an indication of
therapeutic utility, include but are not limited to nervous system injuries, and diseases or
disorders which result in either a disconnection of axons, a diminution or degeneration of
neurons, or demyelination. Nervous system lesions which may be treated in a patient (including
human and non-human mammalian patients) according to the invention include but are not
limited to the following lesions of either the central (including spinal cord, brain) or peripheral
nervous systems:

(i) traumatic lesions, including lesions caused by physical injury or associated with
surgery, for example, lesions which sever a portion of the nervous system, or compression
injuries;

(ii) ischemic lesions, in which a lack of oxygen in a portion of the nervous system
results in neuronal injury or death, including cerebral infarction or ischemia, or spinal cord
infarction or ischemia;

(iii) infectious lesions, in which a portion of the nervous system is destroyed or injured
as a result of infection, for example, by an abscess or associated with infection by human
immunodeficiency virus, herpes zoster, or herpes simplex virus or with Lyme disease,
tuberculosis, or syphilis;

(iv) degenerative lesions, in which a portion of the nervous system is destroyed or
injured as a result of a degenerative process including but not limited to degeneration associated
with Parkinson's disease, Alzheimer's disease, Huntington's chorea, or amyotrophic lateral
sclerosis;

(v) lesions associated with nutritional diseases or disorders, in which a portion of the
nervous system is destroyed or injured by a nutritional disorder or disorder of metabolism
including but not limited to, vitamin B12 deficiency, folic acid deficiency, Wernicke disease,
tobacco-alcohol amblyopia, Marchiafava-Bignami disease (primary degeneration of the corpus
callosum), and alcoholic cerebellar degeneration;

(vi) neurological lesions associated with systemic diseases including but not limited to
diabetes (diabetic neuropathy, Bell's palsy), systemic lupus erythematosus, carcinoma, or
sarcoidosis;

(vii) lesions caused by toxic substances including alcohol, lead, or particular
neurotoxins; and

(viii) demyelinated lesions in which a portion of the nervous system is destroyed or
injured by a demyelinating disease including but not limited to multiple sclerosis, human
immunodeficiency virus-associated myelopathy, transverse myelopathy of various etiologies,
progressive multifocal leukoencephalopathy, and central pontine myelinolysis.

Therapeutics which are useful according to the invention for treatment of a nervous
system disorder may be selected by testing for biological activity in promoting the survival or
differentiation of neurons. For example, and not by way of limitation, therapeutics which elicit
any of the following effects may be useful according to the invention:

(i) increased survival time of neurons in culture;

(ii) increased sprouting of neurons in culture or in vivo;

(iii) increased production of a neuron-associated molecule in culture or in vivo, e.g.,
choline acetyltransferase or acetylcholinesterase with respect to motor neurons; or

(iv) decreased symptoms of neuron dysfunction in vivo.

Such effects may be measured by any method known in the art. In preferred,
non-limiting embodiments, increased survival of neurons may be measured by the method set
forth in Arakawa et al. (1990, J. Neurosci. 10:3507-3515); increased sprouting of neurons may
be detected by methods set forth in Pestronk et al. (1980, Exp. Neurol. 70:65-82) or Brown et al.
(1981, Ann. Rev. Neurosci. 4:17-42); increased production of neuron-associated molecules may
be measured by bioassay, enzymatic assay, antibody binding, Northern blot assay, etc.,
depending on the molecule to be measured; and motor neuron dysfunction may be measured by
assessing the physical manifestation of motor neuron disorder, e.g., weakness, motor neuron
conduction velocity, or functional disability.

In specific embodiments, motor neuron disorders that may be treated according to the
invention include but are not limited to disorders such as infarction, infection, exposure to toxin,
trauma, surgical damage, degenerative disease or malignancy that may affect motor neurons as
well as other components of the nervous system, as well as disorders that selectively affect
neurons such as amyotrophic lateral sclerosis, and including but not limited to progressive spinal
muscular atrophy, progressive bulbar palsy, primary lateral sclerosis, infantile and juvenile
muscular atrophy, progressive bulbar paralysis of childhood (Fazio-Londe syndrome),
poliomyelitis and the post polio syndrome, and Hereditary Motorsensory Neuropathy
(Charcot-Marie-Tooth Disease).

4.10.18 OTHER ACTIVITIES 

A polypeptide of the invention may also exhibit one or more of the following additional
activities or effects: inhibiting the growth, infection or function of, or killing, infectious agents,
including, without limitation, bacteria, viruses, fungi and other parasites; effecting (suppressing
or enhancing) bodily characteristics, including, without limitation, height, weight, hair color, eye
color, skin, fat to lean ratio or other tissue pigmentation, or organ or body part size or shape
(such as, for example, breast augmentation or diminution, change in bone form or shape);
effecting biorhythms or circadian cycles or rhythms; effecting the fertility of male or female
subjects; effecting the metabolism, catabolism, anabolism, processing, utilization, storage or
elimination of dietary fat, lipid, protein, carbohydrate, vitamins, minerals, co-factors or other
nutritional factors or component(s); effecting behavioral characteristics, including, without
limitation, appetite, libido, stress, cognition (including cognitive disorders), depression
(including depressive disorders) and violent behaviors; providing analgesic effects or other pain
reducing effects; promoting differentiation and growth of embryonic stem cells in lineages other
than hematopoietic lineages; hormonal or endocrine activity; in the case of enzymes, correcting
deficiencies of the enzyme and treating deficiency-related diseases; treatment of
hyperproliferative disorders (such as, for example, psoriasis); immunoglobulin-like activity (such
as, for example, the ability to bind antigens or complement); and the ability to act as an antigen
in a vaccine composition to raise an immune response against such protein or another material or
entity which is cross-reactive with such protein.



4.10.19 IDENTIFICATION OF POLYMORPHISMS 

The demonstration of polymorphisms makes possible the identification of such
polymorphisms in human subjects and the pharmacogenetic use of this information for diagnosis
and treatment. Such polymorphisms may be associated with, e.g., differential predisposition or
susceptibility to various disease states (such as disorders involving inflammation or immune
response) or a differential response to drug administration, and this genetic information can be
used to tailor preventive or therapeutic treatment appropriately. For example, the existence of a
polymorphism associated with a predisposition to inflammation or autoimmune disease makes
possible the diagnosis of this condition in humans by identifying the presence of the
polymorphism.

Polymorphisms can be identified in a variety of ways known in the art which all
generally involve obtaining a sample from a patient, analyzing DNA from the sample, optionally
involving isolation or amplification of the DNA, and identifying the presence of the
polymorphism in the DNA. For example, PCR may be used to amplify an appropriate fragment
of genomic DNA which may then be sequenced. Alternatively, the DNA may be subjected to
allele-specific oligonucleotide hybridization (in which appropriate oligonucleotides are
hybridized to the DNA under conditions permitting detection of a single base mismatch) or to a
single nucleotide extension assay (in which an oligonucleotide that hybridizes immediately
adjacent to the position of the polymorphism is extended with one or more labeled nucleotides).
In addition, traditional restriction fragment length polymorphism analysis (using restriction
enzymes that provide differential digestion of the genomic DNA depending on the presence or
absence of the polymorphism) may be performed. Arrays with nucleotide sequences of the
present invention can be used to detect polymorphisms. The array can comprise modified
nucleotide sequences of the present invention in order to detect the nucleotide sequences of the
present invention. In the alternative, any one of the nucleotide sequences of the present
invention can be placed on the array to detect changes from those sequences.
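
The restriction fragment length polymorphism analysis mentioned above can be illustrated in silico. The following Python sketch is illustrative only and assumes a linear DNA fragment, a single enzyme, and toy sequences invented for the example; the function name and parameters are hypothetical. It shows how a single base change that abolishes a recognition site (here EcoRI, GAATTC, which cuts after the first G) changes the set of fragment lengths.

```python
def fragment_lengths(sequence, site, cut_offset):
    """Lengths of fragments produced by cutting a linear sequence.

    `site` is the recognition sequence and `cut_offset` is the number of
    bases from the start of the site to the cut position on the top strand.
    """
    sequence = sequence.upper()
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    bounds = [0] + cuts + [len(sequence)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


# Toy alleles (made up for illustration): the variant changes GAATTC to GAGTTC.
reference_allele = "AAAAGAATTCAAAAAA"
variant_allele = "AAAAGAGTTCAAAAAA"

reference_fragments = fragment_lengths(reference_allele, "GAATTC", 1)
variant_fragments = fragment_lengths(variant_allele, "GAATTC", 1)
polymorphism_detected = reference_fragments != variant_fragments
```

In this toy case the reference allele is cut once and the variant allele is left uncut, so the two alleles would be distinguishable by fragment size.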

Alternatively, a polymorphism resulting in a change in the amino acid sequence could
also be detected by detecting a corresponding change in amino acid sequence of the protein, e.g.,
by an antibody specific to the variant sequence.

4.10.20 ARTHRITIS AND INFLAMMATION 

The immunosuppressive effects of the compositions of the invention against rheumatoid
arthritis are determined in an experimental animal model system. The experimental model system
is adjuvant induced arthritis in rats, and the protocol is described by J. Holoshitz, et al., 1983,
Science, 219:56, or by B. Waksman et al., 1963, Int. Arch. Allergy Appl. Immunol., 23:129.
Induction of the disease can be caused by a single injection, generally intradermally, of a
suspension of killed Mycobacterium tuberculosis in complete Freund's adjuvant (CFA). The
route of injection can vary, but rats may be injected at the base of the tail with an adjuvant
mixture. The polypeptide is administered in phosphate buffered solution (PBS) at a dose of about
1-5 mg/kg. The control consists of administering PBS only.

The procedure for testing the effects of the test compound would consist of intradermally
injecting killed Mycobacterium tuberculosis in CFA followed by immediately administering the
test compound and subsequent treatment every other day until day 24. At 14, 15, 18, 20, 22, and
24 days after injection of Mycobacterium CFA, an overall arthritis score may be obtained as
described by J. Holoshitz above. An analysis of the data would reveal that the test compound
would have a dramatic effect on the swelling of the joints as measured by a decrease of the
arthritis score.
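
The dosing and scoring timetable of this protocol can be written out explicitly. The Python sketch below is only an illustration under stated assumptions: the day of adjuvant injection is taken as day 0, "every other day" is read as every second day, and the helper name and the per-day score dictionaries are hypothetical. No scores are supplied because none are reported here.

```python
# Assumption: day 0 is the day of Mycobacterium/CFA injection and first dose.
TREATMENT_DAYS = list(range(0, 25, 2))    # every other day through day 24
SCORING_DAYS = [14, 15, 18, 20, 22, 24]   # days on which arthritis is scored


def score_reduction(control_scores, treated_scores):
    """Control minus treated arthritis score for each scoring day.

    Both arguments map a day number to an overall arthritis score
    (hypothetical inputs). Days missing from either group are skipped.
    A positive value means the treated group scored lower than control.
    """
    return {
        day: control_scores[day] - treated_scores[day]
        for day in SCORING_DAYS
        if day in control_scores and day in treated_scores
    }
```

Under these assumptions the treated animals receive thirteen doses, and a decrease of the arthritis score appears as positive values on each scoring day.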

4.11 THERAPEUTIC METHODS

The compositions (including polypeptide fragments, analogs, variants and antibodies or
other binding partners or modulators including antisense polynucleotides) of the invention have
numerous applications in a variety of therapeutic methods. Examples of therapeutic applications
include, but are not limited to, those exemplified herein.

4.11.1 EXAMPLE

One embodiment of the invention is the administration of an effective amount of the
polypeptides or other composition of the invention to individuals affected by a disease or
disorder that can be modulated by regulating the peptides of the invention. While the mode of
administration is not particularly important, parenteral administration is preferred. An
exemplary mode of administration is to deliver an intravenous bolus. The dosage of the
polypeptides or other composition of the invention will normally be determined by the
prescribing physician. It is to be expected that the dosage will vary according to the age, weight,
condition and response of the individual patient. Typically, the amount of polypeptide
administered per dose will be in the range of about 0.01 µg/kg to 100 mg/kg of body weight, with
the preferred dose being about 0.1 µg/kg to 10 mg/kg of patient body weight. For parenteral
administration, polypeptides of the invention will be formulated in an injectable form combined
with a pharmaceutically acceptable parenteral vehicle. Such vehicles are well known in the art
and examples include water, saline, Ringer's solution, dextrose solution, and solutions consisting
of small amounts of the human serum albumin. The vehicle may contain minor amounts of
additives that maintain the isotonicity and stability of the polypeptide or other active ingredient.
The preparation of such solutions is within the skill of the art.
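
To make the per-kilogram ranges above concrete, the following Python sketch converts them into absolute amounts per dose for a given body weight. It is a unit-conversion illustration only, not dosing guidance; the function name, parameter names and the example weight are assumptions, and the default bounds are the general range stated above (0.01 µg/kg to 100 mg/kg).

```python
UG_PER_MG = 1000.0  # micrograms per milligram


def dose_range_ug(body_weight_kg, low_ug_per_kg=0.01, high_mg_per_kg=100.0):
    """Return (low, high) polypeptide amount per dose in micrograms.

    The lower bound is given in ug/kg and the upper bound in mg/kg,
    mirroring the mixed units used in the text.
    """
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low = low_ug_per_kg * body_weight_kg
    high = high_mg_per_kg * UG_PER_MG * body_weight_kg
    return low, high


# General range and preferred range for a hypothetical 70 kg patient.
general_range = dose_range_ug(70.0)
preferred_range = dose_range_ug(70.0, low_ug_per_kg=0.1, high_mg_per_kg=10.0)
```

For a 70 kg patient the general range works out to roughly 0.7 µg to 7 g per dose, and the preferred range to roughly 7 µg to 0.7 g, which shows how wide the stated ranges are.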

4.12 PHARMACEUTICAL FORMULATIONS AND ROUTES OF
ADMINISTRATION

A protein or other composition of the present invention (from whatever source derived,
including without limitation from recombinant and non-recombinant sources and including
antibodies and other binding partners of the polypeptides of the invention) may be administered
to a patient in need, by itself, or in pharmaceutical compositions where it is mixed with suitable
carriers or excipient(s) at doses to treat or ameliorate a variety of disorders. Such a composition
may optionally contain (in addition to protein or other active ingredient and a carrier) diluents,
fillers, salts, buffers, stabilizers, solubilizers, and other materials well known in the art. The term
"pharmaceutically acceptable" means a non-toxic material that does not interfere with the
effectiveness of the biological activity of the active ingredient(s). The characteristics of the
carrier will depend on the route of administration. The pharmaceutical composition of the
invention may also contain cytokines, lymphokines, or other hematopoietic factors such as
M-CSF, GM-CSF, TNF, IL-1, IL-2, IL-3, IL-4, IL-5, IL-6, IL-7, IL-8, IL-10, IL-11, IL-12,
IL-13, IL-14, IL-15, IFN, TNF0, TNF1, TNF2, G-CSF, Meg-CSF, thrombopoietin, stem cell
factor, and erythropoietin. In further compositions, proteins of the invention may be combined
with other agents beneficial to the treatment of the disease or disorder in question. These agents
include various growth factors such as epidermal growth factor (EGF), platelet-derived growth
factor (PDGF), transforming growth factors (TGF-α and TGF-β), insulin-like growth factor
(IGF), as well as cytokines described herein.

The pharmaceutical composition may further contain other agents which either enhance
the activity of the protein or other active ingredient or complement its activity or use in
treatment. Such additional factors and/or agents may be included in the pharmaceutical
composition to produce a synergistic effect with protein or other active ingredient of the
invention, or to minimize side effects. Conversely, protein or other active ingredient of the
present invention may be included in formulations of the particular clotting factor, cytokine,
lymphokine, other hematopoietic factor, thrombolytic or anti-thrombotic factor, or
anti-inflammatory agent to minimize side effects of the clotting factor, cytokine, lymphokine,
other hematopoietic factor, thrombolytic or anti-thrombotic factor, or anti-inflammatory agent
(such as IL-1Ra, IL-1 Hy1, IL-1 Hy2, anti-TNF, corticosteroids, immunosuppressive agents). A
protein of the present invention may be active in multimers (e.g., heterodimers or homodimers) or
complexes with itself or other proteins. As a result, pharmaceutical compositions of the
invention may comprise a protein of the invention in such multimeric or complexed form.

As an alternative to being included in a pharmaceutical composition of the invention
including a first protein, a second protein or a therapeutic agent may be concurrently
administered with the first protein (e.g., at the same time, or at differing times provided that
therapeutic concentrations of the combination of agents are achieved at the treatment site).
Techniques for formulation and administration of the compounds of the instant application may
be found in "Remington's Pharmaceutical Sciences," Mack Publishing Co., Easton, PA, latest
edition. A therapeutically effective dose further refers to that amount of the compound sufficient
to result in amelioration of symptoms, e.g., treatment, healing, prevention or amelioration of the
relevant medical condition, or an increase in rate of treatment, healing, prevention or
amelioration of such conditions. When applied to an individual active ingredient, administered
alone, a therapeutically effective dose refers to that ingredient alone. When applied to a
combination, a therapeutically effective dose refers to combined amounts of the active
ingredients that result in the therapeutic effect, whether administered in combination, serially or
simultaneously.

In practicing the method of treatment or use of the present invention, a therapeutically
effective amount of protein or other active ingredient of the present invention is administered to
a mammal having a condition to be treated. Protein or other active ingredient of the present
invention may be administered in accordance with the method of the invention either alone or in
combination with other therapies such as treatments employing cytokines, lymphokines or other
hematopoietic factors. When co-administered with one or more cytokines, lymphokines or other
hematopoietic factors, protein or other active ingredient of the present invention may be
administered either simultaneously with the cytokine(s), lymphokine(s), other hematopoietic
factor(s), thrombolytic or anti-thrombotic factors, or sequentially. If administered sequentially,
the attending physician will decide on the appropriate sequence of administering protein or other
active ingredient of the present invention in combination with cytokine(s), lymphokine(s), other
hematopoietic factor(s), thrombolytic or anti-thrombotic factors.

4.12.1 ROUTES OF ADMINISTRATION 

Suitable routes of administration may, for example, include oral, rectal, transmucosal, or
intestinal administration; parenteral delivery, including intramuscular, subcutaneous,
intramedullary injections, as well as intrathecal, direct intraventricular, intravenous,
intraperitoneal, intranasal, or intraocular injections. Administration of protein or other active
ingredient of the present invention used in the pharmaceutical composition or to practice the
method of the present invention can be carried out in a variety of conventional ways, such as oral
ingestion, inhalation, topical application or cutaneous, subcutaneous, intraperitoneal, parenteral
or intravenous injection. Intravenous administration to the patient is preferred.

Alternately, one may administer the compound in a local rather than systemic manner, for
example, via injection of the compound directly into an arthritic joint or into fibrotic tissue,
often in a depot or sustained release formulation. In order to prevent the scarring process
frequently occurring as a complication of glaucoma surgery, the compounds may be administered
topically, for example, as eye drops. Furthermore, one may administer the drug in a targeted drug
delivery system, for example, in a liposome coated with a specific antibody, targeting, for example,
arthritic or fibrotic tissue. The liposomes will be targeted to and taken up selectively by the
afflicted tissue.

The polypeptides of the invention are administered by any route that delivers an effective
dosage to the desired site of action. The determination of a suitable route of administration and
an effective dosage for a particular indication is within the level of skill in the art. Preferably for
wound treatment, one administers the therapeutic compound directly to the site. Suitable dosage
ranges for the polypeptides of the invention can be extrapolated from these dosages or from
similar studies in appropriate animal models. Dosages can then be adjusted as necessary by the
clinician to provide maximal therapeutic benefit.

4.12.2 COMPOSITIONS/FORMULATIONS 

Pharmaceutical compositions for use in accordance with the present invention thus may
be formulated in a conventional manner using one or more physiologically acceptable carriers
comprising excipients and auxiliaries which facilitate processing of the active compounds into
preparations which can be used pharmaceutically. These pharmaceutical compositions may be
manufactured in a manner that is itself known, e.g., by means of conventional mixing,
dissolving, granulating, dragee-making, levigating, emulsifying, encapsulating, entrapping or
lyophilizing processes. Proper formulation is dependent upon the route of administration chosen.
When a therapeutically effective amount of protein or other active ingredient of the present
invention is administered orally, protein or other active ingredient of the present invention will
be in the form of a tablet, capsule, powder, solution or elixir. When administered in tablet form,
the pharmaceutical composition of the invention may additionally contain a solid carrier such as
a gelatin or an adjuvant. The tablet, capsule, and powder contain from about 5 to 95% protein or
other active ingredient of the present invention, and preferably from about 25 to 90% protein or
other active ingredient of the present invention. When administered in liquid form, a liquid
carrier such as water, petroleum, oils of animal or plant origin such as peanut oil, mineral oil,
soybean oil, or sesame oil, or synthetic oils may be added. The liquid form of the
pharmaceutical composition may further contain physiological saline solution, dextrose or other
saccharide solution, or glycols such as ethylene glycol, propylene glycol or polyethylene glycol.
When administered in liquid form, the pharmaceutical composition contains from about 0.5 to
90% by weight of protein or other active ingredient of the present invention, and preferably from
about 1 to 50% protein or other active ingredient of the present invention.

When a therapeutically effective amount of protein or other active ingredient of the
present invention is administered by intravenous, cutaneous or subcutaneous injection, protein or
other active ingredient of the present invention will be in the form of a pyrogen-free, parenterally
acceptable aqueous solution. The preparation of such parenterally acceptable protein or other
active ingredient solutions, having due regard to pH, isotonicity, stability, and the like, is within
the skill in the art. A preferred pharmaceutical composition for intravenous, cutaneous, or
subcutaneous injection should contain, in addition to protein or other active ingredient of the
present invention, an isotonic vehicle such as Sodium Chloride Injection, Ringer's Injection,
Dextrose Injection, Dextrose and Sodium Chloride Injection, Lactated Ringer's Injection, or
other vehicle as known in the art. The pharmaceutical composition of the present invention may
also contain stabilizers, preservatives, buffers, antioxidants, or other additives known to those of
skill in the art. For injection, the agents of the invention may be formulated in aqueous solutions,
preferably in physiologically compatible buffers such as Hanks's solution, Ringer's solution, or
physiological saline buffer. For transmucosal administration, penetrants appropriate to the
barrier to be permeated are used in the formulation. Such penetrants are generally known in the
art.

For oral administration, the compounds can be formulated readily by combining the
active compounds with pharmaceutically acceptable carriers well known in the art. Such carriers
enable the compounds of the invention to be formulated as tablets, pills, dragees, capsules,
liquids, gels, syrups, slurries, suspensions and the like, for oral ingestion by a patient to be
treated. Pharmaceutical preparations for oral use can be obtained by combining the active
compounds with a solid excipient, optionally grinding the resulting mixture, and processing the
mixture of granules, after adding suitable auxiliaries, if desired, to obtain tablets or dragee cores.
Suitable excipients are, in particular, fillers such as sugars, including lactose, sucrose, mannitol,
or sorbitol; cellulose preparations such as, for example, maize starch, wheat starch, rice starch,
potato starch, gelatin, gum tragacanth, methyl cellulose, hydroxypropylmethyl-cellulose, sodium
carboxymethylcellulose, and/or polyvinylpyrrolidone (PVP). If desired, disintegrating agents
may be added, such as the cross-linked polyvinyl pyrrolidone, agar, or alginic acid or a salt
thereof such as sodium alginate. Dragee cores are provided with suitable coatings. For this
purpose, concentrated sugar solutions may be used, which may optionally contain gum arabic,
talc, polyvinyl pyrrolidone, carbopol gel, polyethylene glycol, and/or titanium dioxide, lacquer
solutions, and suitable organic solvents or solvent mixtures. Dyestuffs or pigments may be
added to the tablets or dragee coatings for identification or to characterize different combinations
of active compound doses.

Pharmaceutical preparations which can be used orally include push-fit capsules made of
gelatin, as well as soft, sealed capsules made of gelatin and a plasticizer, such as glycerol or
sorbitol. The push-fit capsules can contain the active ingredients in admixture with filler such as
lactose, binders such as starches, and/or lubricants such as talc or magnesium stearate and,
optionally, stabilizers. In soft capsules, the active compounds may be dissolved or suspended in
suitable liquids, such as fatty oils, liquid paraffin, or liquid polyethylene glycols. In addition,
stabilizers may be added. All formulations for oral administration should be in dosages suitable
for such administration. For buccal administration, the compositions may take the form of
tablets or lozenges formulated in conventional manner.

For administration by inhalation, the compounds for use according to the present
invention are conveniently delivered in the form of an aerosol spray presentation from
pressurized packs or a nebuliser, with the use of a suitable propellant, e.g.,
dichlorodifluoromethane, trichlorofluoromethane, dichlorotetrafluoroethane, carbon dioxide or
other suitable gas. In the case of a pressurized aerosol the dosage unit may be determined by
providing a valve to deliver a metered amount. Capsules and cartridges of, e.g., gelatin for use in
an inhaler or insufflator may be formulated containing a powder mix of the compound and a
suitable powder base such as lactose or starch. The compounds may be formulated for parenteral
administration by injection, e.g., by bolus injection or continuous infusion. Formulations for
injection may be presented in unit dosage form, e.g., in ampules or in multi-dose containers, with
an added preservative. The compositions may take such forms as suspensions, solutions or
emulsions in oily or aqueous vehicles, and may contain formulatory agents such as suspending,
stabilizing and/or dispersing agents.

Pharmaceutical formulations for parenteral administration include aqueous solutions of
the active compounds in water-soluble form. Additionally, suspensions of the active compounds
may be prepared as appropriate oily injection suspensions. Suitable lipophilic solvents or
vehicles include fatty oils such as sesame oil, or synthetic fatty acid esters, such as ethyl oleate or
triglycerides, or liposomes. Aqueous injection suspensions may contain substances which
increase the viscosity of the suspension, such as sodium carboxymethyl cellulose, sorbitol, or
dextran. Optionally, the suspension may also contain suitable stabilizers or agents which
increase the solubility of the compounds to allow for the preparation of highly concentrated
solutions. Alternatively, the active ingredient may be in powder form for constitution with a
suitable vehicle, e.g., sterile pyrogen-free water, before use.

The compounds may also be formulated in rectal compositions such as suppositories or
retention enemas, e.g., containing conventional suppository bases such as cocoa butter or other
glycerides. In addition to the formulations described previously, the compounds may also be
formulated as a depot preparation. Such long acting formulations may be administered by
implantation (for example subcutaneously or intramuscularly) or by intramuscular injection.
Thus, for example, the compounds may be formulated with suitable polymeric or hydrophobic
materials (for example as an emulsion in an acceptable oil) or ion exchange resins, or as
sparingly soluble derivatives, for example, as a sparingly soluble salt.

A pharmaceutical carrier for the hydrophobic compounds of the invention is a co-solvent
system comprising benzyl alcohol, a nonpolar surfactant, a water-miscible organic polymer, and
an aqueous phase. The co-solvent system may be the VPD co-solvent system. VPD is a solution
of 3% w/v benzyl alcohol, 8% w/v of the nonpolar surfactant polysorbate 80, and 65% w/v
polyethylene glycol 300, made up to volume in absolute ethanol. The VPD co-solvent system
(VPD:5W) consists of VPD diluted 1:1 with a 5% dextrose in water solution. This co-solvent
system dissolves hydrophobic compounds well, and itself produces low toxicity upon systemic
administration. Naturally, the proportions of a co-solvent system may be varied considerably
without destroying its solubility and toxicity characteristics. Furthermore, the identity of the
co-solvent components may be varied: for example, other low-toxicity nonpolar surfactants may
be used instead of polysorbate 80; the fraction size of polyethylene glycol may be varied; other
biocompatible polymers may replace polyethylene glycol, e.g. polyvinyl pyrrolidone; and other
sugars or polysaccharides may substitute for dextrose. Alternatively, other delivery systems for
hydrophobic pharmaceutical compounds may be employed. Liposomes and emulsions are well
known examples of delivery vehicles or carriers for hydrophobic drugs. Certain organic solvents
such as dimethylsulfoxide also may be employed, although usually at the cost of greater toxicity.
Additionally, the compounds may be delivered using a sustained-release system, such as
semipermeable matrices of solid hydrophobic polymers containing the therapeutic agent.
Various types of sustained-release materials have been established and are well known by those
skilled in the art. Sustained-release capsules may, depending on their chemical nature, release the
compounds for a few weeks up to over 100 days. Depending on the chemical nature and the
biological stability of the therapeutic reagent, additional strategies for protein or other active
ingredient stabilization may be employed.
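
The VPD and VPD:5W compositions described above reduce to simple w/v arithmetic. The
following sketch is illustrative only: it computes the mass of each VPD component for a chosen
batch volume and the nominal concentrations after the 1:1 dilution with 5% dextrose in water.
The function names and the 100 mL batch size are assumptions made here, and the dilution
step assumes that the two volumes are additive on mixing, which the text does not state.

```python
# Percent w/v (grams per 100 mL) of each VPD component, as given in the text.
# The remainder of the volume is absolute ethanol.
VPD_PERCENT_WV = {
    "benzyl alcohol": 3.0,
    "polysorbate 80": 8.0,
    "polyethylene glycol 300": 65.0,
}
DEXTROSE_5W_PERCENT_WV = {"dextrose": 5.0}


def grams_for_volume(percent_wv, volume_ml):
    """Mass in grams of each component needed for volume_ml of solution."""
    return {name: pct * volume_ml / 100.0 for name, pct in percent_wv.items()}


def dilute_one_to_one(stock, diluent):
    """Nominal % w/v after mixing equal volumes (assumes additive volumes)."""
    mixed = {name: pct / 2.0 for name, pct in stock.items()}
    for name, pct in diluent.items():
        mixed[name] = mixed.get(name, 0.0) + pct / 2.0
    return mixed


# Illustrative 100 mL batch of VPD, then the VPD:5W mixture.
print(grams_for_volume(VPD_PERCENT_WV, 100.0))
print(dilute_one_to_one(VPD_PERCENT_WV, DEXTROSE_5W_PERCENT_WV))
```

Under the additive-volume assumption, each VPD component is halved in VPD:5W and dextrose is
present at a nominal 2.5% w/v.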

The pharmaceutical compositions also may comprise suitable solid or gel phase carriers
or excipients. Examples of such carriers or excipients include but are not limited to calcium
carbonate, calcium phosphate, various sugars, starches, cellulose derivatives, gelatin, and
polymers such as polyethylene glycols. Many of the active ingredients of the invention may be
provided as salts with pharmaceutically compatible counter ions. Such pharmaceutically
acceptable base addition salts are those salts which retain the biological effectiveness and
properties of the free acids and which are obtained by reaction with inorganic or organic bases
such as sodium hydroxide, magnesium hydroxide, ammonia, trialkylamine, dialkylamine,
monoalkylamine, dibasic amino acids, sodium acetate, potassium benzoate, triethanol amine and
the like.

The pharmaceutical composition of the invention may be in the form of a complex of the
protein(s) or other active ingredient(s) of the present invention along with protein or peptide
antigens. The protein and/or peptide antigen will deliver a stimulatory signal to both B and T
lymphocytes. B lymphocytes will respond to antigen through their surface immunoglobulin
receptor. T lymphocytes will respond to antigen through the T cell receptor (TCR) following
presentation of the antigen by MHC proteins. MHC and structurally related proteins including
those encoded by class I and class II MHC genes on host cells will serve to present the peptide
antigen(s) to T lymphocytes. The antigen components could also be supplied as purified
MHC-peptide complexes alone or with co-stimulatory molecules that can directly signal T cells.
Alternatively, antibodies able to bind surface immunoglobulin and other molecules on B cells as
well as antibodies able to bind the TCR and other molecules on T cells can be combined with the
pharmaceutical composition of the invention.

The pharmaceutical composition of the invention may be in the form of a liposome in
which protein of the present invention is combined, in addition to other pharmaceutically
acceptable carriers, with amphipathic agents such as lipids which exist in aggregated form as
micelles, insoluble monolayers, liquid crystals, or lamellar layers in aqueous solution. Suitable
lipids for liposomal formulation include, without limitation, monoglycerides, diglycerides,
sulfatides, lysolecithins, phospholipids, saponin, bile acids, and the like. Preparation of such
liposomal formulations is within the level of skill in the art, as disclosed, for example, in U.S.
Patent Nos. 4,235,871; 4,501,728; 4,837,028; and 4,737,323, all of which are incorporated
herein by reference.

The amount of protein or other active ingredient of the present invention in the
pharmaceutical composition of the present invention will depend upon the nature and severity of
the condition being treated, and on the nature of prior treatments which the patient has
undergone. Ultimately, the attending physician will decide the amount of protein or other active
ingredient of the present invention with which to treat each individual patient. Initially, the
attending physician will administer low doses of protein or other active ingredient of the present
invention and observe the patient's response. Larger doses of protein or other active ingredient
of the present invention may be administered until the optimal therapeutic effect is obtained for
the patient, and at that point the dosage is not increased further. It is contemplated that the
various pharmaceutical compositions used to practice the method of the present invention should
contain about 0.01 µg to about 100 mg (preferably about 0.1 µg to about 10 mg, more preferably
about 0.1 µg to about 1 mg) of protein or other active ingredient of the present invention per kg
body weight. For compositions of the present invention which are useful for bone, cartilage,
tendon or ligament regeneration, the therapeutic method includes administering the composition
topically, systemically, or locally as an implant or device. When administered, the therapeutic
composition for use in this invention is, of course, in a pyrogen-free, physiologically acceptable
form. Further, the composition may desirably be encapsulated or injected in a viscous form for
delivery to the site of bone, cartilage or tissue damage. Topical administration may be suitable
for wound healing and tissue repair. Therapeutically useful agents other than a protein or other
active ingredient of the invention, which may also optionally be included in the composition as
described above, may alternatively or additionally be administered simultaneously or
sequentially with the composition in the methods of the invention. Preferably for bone and/or
cartilage formation, the composition would include a matrix capable of delivering the
protein-containing or other active ingredient-containing composition to the site of bone and/or
cartilage damage, providing a structure for the developing bone and cartilage and optimally
capable of being resorbed into the body. Such matrices may be formed of materials presently in
use for other implanted medical applications.

The choice of matrix material is based on biocompatibility, biodegradability, mechanical
properties, cosmetic appearance and interface properties. The particular application of the
compositions will define the appropriate formulation. Potential matrices for the compositions
may be biodegradable and chemically defined calcium sulfate, tricalcium phosphate,
hydroxyapatite, polylactic acid, polyglycolic acid and polyanhydrides. Other potential materials
are biodegradable and biologically well-defined, such as bone or dermal collagen. Further
matrices are comprised of pure proteins or extracellular matrix components. Other potential
matrices are nonbiodegradable and chemically defined, such as sintered hydroxyapatite, bioglass,
aluminates, or other ceramics. Matrices may be comprised of combinations of any of the
above-mentioned types of material, such as polylactic acid and hydroxyapatite or collagen and
tricalcium phosphate. The bioceramics may be altered in composition, such as in
calcium-aluminate-phosphate and processing to alter pore size, particle size, particle shape, and
biodegradability. Presently preferred is a 50:50 (mole weight) copolymer of lactic acid and
glycolic acid in the form of porous particles having diameters ranging from 150 to 800 microns.
In some applications, it will be useful to utilize a sequestering agent, such as carboxymethyl
cellulose or autologous blood clot, to prevent the protein compositions from disassociating from
the matrix.

A preferred family of sequestering agents is cellulosic materials such as alkylcelluloses
(including hydroxyalkylcelluloses), including methylcellulose, ethylcellulose,
hydroxyethylcellulose, hydroxypropylcellulose, hydroxypropyl-methylcellulose, and
carboxymethylcellulose, the most preferred being cationic salts of carboxymethylcellulose
(CMC). Other preferred sequestering agents include hyaluronic acid, sodium alginate,
poly(ethylene glycol), polyoxyethylene oxide, carboxyvinyl polymer and poly(vinyl alcohol).
The amount of sequestering agent useful herein is 0.5-20 wt %, preferably 1-10 wt % based on
total formulation weight, which represents the amount necessary to prevent desorption of the
protein from the polymer matrix and to provide appropriate handling of the composition, yet not
so much that the progenitor cells are prevented from infiltrating the matrix, thereby providing the
protein the opportunity to assist the osteogenic activity of the progenitor cells. In further
compositions, proteins or other active ingredients of the invention may be combined with other
agents beneficial to the treatment of the bone and/or cartilage defect, wound, or tissue in
question. These agents include various growth factors such as epidermal growth factor (EGF),
platelet derived growth factor (PDGF), transforming growth factors (TGF-α and TGF-β), and
insulin-like growth factor (IGF).

The therapeutic compositions are also presently valuable for veterinary applications.
Particularly domestic animals and thoroughbred horses, in addition to humans, are desired
patients for such treatment with proteins or other active ingredients of the present invention. The
dosage regimen of a protein-containing pharmaceutical composition to be used in tissue
regeneration will be determined by the attending physician considering various factors which
modify the action of the proteins, e.g., amount of tissue weight desired to be formed, the site of
damage, the condition of the damaged tissue, the size of a wound, type of damaged tissue (e.g.,
bone), the patient's age, sex, and diet, the severity of any infection, time of administration and
other clinical factors. The dosage may vary with the type of matrix used in the reconstitution and
with inclusion of other proteins in the pharmaceutical composition. For example, the addition of
other known growth factors, such as IGF I (insulin like growth factor I), to the final composition,
may also affect the dosage. Progress can be monitored by periodic assessment of tissue/bone
growth and/or repair, for example, X-rays, histomorphometric determinations and tetracycline
labeling.

Polynucleotides of the present invention can also be used for gene therapy. Such
polynucleotides can be introduced either in vivo or ex vivo into cells for expression in a
mammalian subject. Polynucleotides of the invention may also be administered by other known
methods for introduction of nucleic acid into a cell or organism (including, without limitation, in
the form of viral vectors or naked DNA). Cells may also be cultured ex vivo in the presence of
proteins of the present invention in order to proliferate or to produce a desired effect on or
activity in such cells. Treated cells can then be introduced in vivo for therapeutic purposes.

4.12.3 EFFECTIVE DOSAGE

Pharmaceutical compositions suitable for use in the present invention include
compositions wherein the active ingredients are contained in an effective amount to achieve its
intended purpose. More specifically, a therapeutically effective amount means an amount
effective to prevent development of or to alleviate the existing symptoms of the subject being
treated. Determination of the effective amount is well within the capability of those skilled in
the art, especially in light of the detailed disclosure provided herein. For any compound used in
the method of the invention, the therapeutically effective dose can be estimated initially from
appropriate in vitro assays. For example, a dose can be formulated in animal models to achieve a
circulating concentration range that can be used to more accurately determine useful doses in
humans. For example, a dose can be formulated in animal models to achieve a circulating
concentration range that includes the IC50 as determined in cell culture (i.e., the concentration of
the test compound which achieves a half-maximal inhibition of the protein's biological activity).
Such information can be used to more accurately determine useful doses in humans.

A therapeutically effective dose refers to that amount of the compound that results in
amelioration of symptoms or a prolongation of survival in a patient. Toxicity and therapeutic
efficacy of such compounds can be determined by standard pharmaceutical procedures in cell
cultures or experimental animals, e.g., for determining the LD50 (the dose lethal to 50% of the
population) and the ED50 (the dose therapeutically effective in 50% of the population). The dose
ratio between toxic and therapeutic effects is the therapeutic index and it can be expressed as the
ratio between LD50 and ED50. Compounds which exhibit high therapeutic indices are preferred.
The data obtained from these cell culture assays and animal studies can be used in formulating a
range of dosage for use in humans. The dosage of such compounds lies preferably within a range
of circulating concentrations that include the ED50 with little or no toxicity. The dosage may
vary within this range depending upon the dosage form employed and the route of administration
utilized. The exact formulation, route of administration and dosage can be chosen by the
individual physician in view of the patient's condition. See, e.g., Fingl et al., 1975, in "The
Pharmacological Basis of Therapeutics", Ch. 1 p. 1. Dosage amount and interval may be adjusted
individually to provide plasma levels of the active moiety which are sufficient to maintain the
desired effects, or minimal effective concentration (MEC). The MEC will vary for each
compound but can be estimated from in vitro data. Dosages necessary to achieve the MEC will
depend on individual characteristics and route of administration. However, HPLC assays or
bioassays can be used to determine plasma concentrations.
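
The therapeutic index defined above is the ratio LD50/ED50, with both doses expressed in the
same units. The following is a minimal sketch of that calculation and of ranking compounds by
it. The function name and the LD50 and ED50 values are hypothetical and are given for
illustration only; they are not data from this disclosure.

```python
def therapeutic_index(ld50, ed50):
    """Return LD50 / ED50; both doses must be in the same units (e.g. mg/kg)."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Hypothetical (LD50, ED50) pairs in mg/kg for two candidate compounds.
candidates = {"compound A": (500.0, 25.0), "compound B": (120.0, 40.0)}

# Compounds with higher therapeutic indices are preferred, so rank descending.
ranked = sorted(
    candidates,
    key=lambda name: therapeutic_index(*candidates[name]),
    reverse=True,
)
for name in ranked:
    print(f"{name}: therapeutic index = {therapeutic_index(*candidates[name]):.1f}")
```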

Dosage intervals can also be determined using the MEC value. Compounds should be
administered using a regimen which maintains plasma levels above the MEC for 10-90% of the
time, preferably between 30-90% and most preferably between 50-90%. In cases of local
administration or selective uptake, the effective local concentration of the drug may not be
related to plasma concentration.
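
As a rough illustration of the "time above MEC" criterion, the following sketch estimates the fraction of a dosing interval during which the plasma level exceeds the MEC. It assumes a one-compartment model with an instantaneous (bolus) dose, first-order elimination, and no accumulation between doses; none of these assumptions comes from this specification, and the function and parameter names are hypothetical.

```python
import math


def fraction_of_interval_above_mec(c0, mec, half_life_h, interval_h):
    """Fraction of one dosing interval with plasma concentration above the MEC.

    Assumed model: C(t) = c0 * exp(-k * t), with k = ln(2) / half-life.
    c0 and mec share the same concentration units; times are in hours.
    """
    if min(c0, mec, half_life_h, interval_h) <= 0:
        raise ValueError("all arguments must be positive")
    if c0 <= mec:
        return 0.0  # the concentration never exceeds the MEC
    k = math.log(2) / half_life_h
    time_above_h = math.log(c0 / mec) / k  # time at which C(t) falls to the MEC
    return min(1.0, time_above_h / interval_h)
```

For example, with an assumed peak of 8 units, an MEC of 1 unit and a 4 hour half-life, the level stays above the MEC for three half-lives (12 hours), or half of a 24 hour interval, which falls in the most preferred 50-90% band.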

An exemplary dosage regimen for polypeptides or other compositions of the invention
will be in the range of about 0.01 µg/kg to 100 mg/kg of body weight daily, with the preferred
dose being about 0.1 µg/kg to 25 mg/kg of patient body weight daily, varying in adults and
children. Dosing may be once daily, or equivalent doses may be delivered at longer or shorter
intervals.
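
The per-kilogram ranges above can be converted to absolute daily amounts for a given body weight. The sketch below is arithmetic only; it reads the lower bounds as micrograms per kilogram, and the helper name and the 70 kg example weight are assumptions made for illustration.

```python
UG_PER_MG = 1000.0

# Ranges in mg per kg of body weight per day, lower bounds converted from µg/kg.
EXEMPLARY_MG_PER_KG = (0.01 / UG_PER_MG, 100.0)
PREFERRED_MG_PER_KG = (0.1 / UG_PER_MG, 25.0)


def daily_dose_range_mg(weight_kg, per_kg_range=EXEMPLARY_MG_PER_KG):
    """Return (low, high) total daily dose in mg for the given body weight."""
    if weight_kg <= 0:
        raise ValueError("weight must be positive")
    low, high = per_kg_range
    return (low * weight_kg, high * weight_kg)
```

For a hypothetical 70 kg adult, the preferred range works out to roughly 0.007 mg (7 µg) to 1750 mg per day. The actual amount remains a matter for the prescribing physician, as stated in the following paragraph.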

The amount of composition administered will, of course, be dependent on the subject 
being treated, on the subject's age and weight, the severity of the affliction, the manner of 
administration and the judgment of the prescribing physician. 

4.12.4 PACKAGING

The compositions may, if desired, be presented in a pack or dispenser device which may
contain one or more unit dosage forms containing the active ingredient. The pack may, for
example, comprise metal or plastic foil, such as a blister pack. The pack or dispenser device may
be accompanied by instructions for administration. Compositions comprising a compound of the
invention formulated in a compatible pharmaceutical carrier may also be prepared, placed in an
appropriate container, and labeled for treatment of an indicated condition.



4.13 ANTIBODIES 

Also included in the invention are antibodies to proteins, or fragments of proteins of the
invention. The term "antibody" as used herein refers to immunoglobulin molecules and
immunologically active portions of immunoglobulin (Ig) molecules, i.e., molecules that contain
an antigen binding site that specifically binds (immunoreacts with) an antigen. Such antibodies
include, but are not limited to, polyclonal, monoclonal, chimeric, single chain, Fab, Fab' and
F(ab')2 fragments, and an Fab expression library. In general, an antibody molecule obtained from
humans relates to any of the classes IgG, IgM, IgA, IgE and IgD, which differ from one another
by the nature of the heavy chain present in the molecule. Certain classes have subclasses as well,
such as IgG1, IgG2, and others. Furthermore, in humans, the light chain may be a kappa chain or
a lambda chain. Reference herein to antibodies includes a reference to all such classes,
subclasses and types of human antibody species.

An isolated related protein of the invention may be intended to serve as an antigen, or a
portion or fragment thereof, and additionally can be used as an immunogen to generate
antibodies that immunospecifically bind the antigen, using standard techniques for polyclonal
and monoclonal antibody preparation. The full-length protein can be used or, alternatively, the
invention provides antigenic peptide fragments of the antigen for use as immunogens. An
antigenic peptide fragment comprises at least 6 amino acid residues of the amino acid sequence
of the full length protein, such as an amino acid sequence shown in SEQ ID NO: 1787, and
encompasses an epitope thereof such that an antibody raised against the peptide forms a specific
immune complex with the full length protein or with any fragment that contains the epitope.
Preferably, the antigenic peptide comprises at least 10 amino acid residues, or at least 15 amino
acid residues, or at least 20 amino acid residues, or at least 30 amino acid residues. Preferred
epitopes encompassed by the antigenic peptide are regions of the protein that are located on its
surface; commonly these are hydrophilic regions.

In certain embodiments of the invention, at least one epitope encompassed by the
antigenic peptide is a region of a related protein that is located on the surface of the protein, e.g., a
hydrophilic region. A hydrophobicity analysis of the human related protein sequence will
indicate which regions of a related protein are particularly hydrophilic and, therefore, are likely
to encode surface residues useful for targeting antibody production. As a means for targeting
antibody production, hydropathy plots showing regions of hydrophilicity and hydrophobicity
may be generated by any method well known in the art, including, for example, the Kyte
Doolittle or the Hopp Woods methods, either with or without Fourier transformation. See, e.g.,
Hopp and Woods, 1981, Proc. Nat. Acad. Sci. USA 78: 3824-3828; Kyte and Doolittle 1982, J.
Mol. Biol. 157: 105-142, each of which is incorporated herein by reference in its entirety.
Antibodies that are specific for one or more domains within an antigenic protein, or derivatives,
fragments, analogs or homologs thereof, are also provided herein.
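
A hydropathy profile of the kind referred to above can be sketched as a sliding-window average over the published Kyte and Doolittle (1982) hydropathy values. The function names, the default window length and the toy sequence used in the note below are assumptions made for illustration; this is not the specific procedure of the invention, and no Fourier transformation is applied.

```python
# Kyte-Doolittle hydropathy values; more negative means more hydrophilic.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=7):
    """Mean hydropathy of every window of `window` consecutive residues."""
    seq = sequence.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[residue] for residue in seq]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


def most_hydrophilic_window(sequence, window=7):
    """Return (start index, mean score) of the most hydrophilic window."""
    profile = hydropathy_profile(sequence, window)
    start = min(range(len(profile)), key=profile.__getitem__)
    return start, profile[start]
```

On a toy sequence such as "IIIIIIKKKKKKIIIIII" with a window of 6, the lysine stretch gives the lowest mean score, marking it as the candidate surface-exposed region from which an antigenic peptide might be chosen.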

A protein of the invention, or a derivative, fragment, analog, homolog or ortholog
thereof, may be utilized as an immunogen in the generation of antibodies that
immunospecifically bind these protein components.

Various procedures known within the art may be used for the production of polyclonal or
monoclonal antibodies directed against a protein of the invention, or against derivatives,
fragments, analogs, homologs or orthologs thereof (see, for example, Antibodies: A Laboratory
Manual, Harlow E, and Lane D, 1988, Cold Spring Harbor Laboratory Press, Cold Spring
Harbor, NY, incorporated herein by reference). Some of these antibodies are discussed below.

5.13.1 Polyclonal Antibodies

For the production of polyclonal antibodies, various suitable host animals (e.g., rabbit,
goat, mouse or other mammal) may be immunized by one or more injections with the native
protein, a synthetic variant thereof, or a derivative of the foregoing. An appropriate
immunogenic preparation can contain, for example, the naturally occurring immunogenic
protein, a chemically synthesized polypeptide representing the immunogenic protein, or a
recombinantly expressed immunogenic protein. Furthermore, the protein may be conjugated to
a second protein known to be immunogenic in the mammal being immunized. Examples of such
immunogenic proteins include but are not limited to keyhole limpet hemocyanin, serum albumin,
bovine thyroglobulin, and soybean trypsin inhibitor. The preparation can further include an
adjuvant. Various adjuvants used to increase the immunological response include, but are not
limited to, Freund's (complete and incomplete), mineral gels (e.g., aluminum hydroxide), surface
active substances (e.g., lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions,
dinitrophenol, etc.), adjuvants usable in humans such as Bacille Calmette-Guerin and
Corynebacterium parvum, or similar immunostimulatory agents. Additional examples of
adjuvants which can be employed include MPL-TDM adjuvant (monophosphoryl Lipid A,
synthetic trehalose dicorynomycolate).

The polyclonal antibody molecules directed against the immunogenic protein can be
isolated from the mammal (e.g., from the blood) and further purified by well known techniques,
such as affinity chromatography using protein A or protein G, which provide primarily the IgG
fraction of immune serum. Subsequently, or alternatively, the specific antigen which is the
target of the immunoglobulin sought, or an epitope thereof, may be immobilized on a column to
purify the immune specific antibody by immunoaffinity chromatography. Purification of
immunoglobulins is discussed, for example, by D. Wilkinson (The Scientist, published by The
Scientist, Inc., Philadelphia PA, Vol. 14, No. 8 (April 17, 2000), pp. 25-28).

5.13.2 Monoclonal Antibodies 

The term "monoclonal antibody" (MAb) or "monoclonal antibody composition", as used
herein, refers to a population of antibody molecules that contain only one molecular species of
antibody molecule consisting of a unique light chain gene product and a unique heavy chain
gene product. In particular, the complementarity determining regions (CDRs) of the monoclonal
antibody are identical in all the molecules of the population. MAbs thus contain an antigen
binding site capable of immunoreacting with a particular epitope of the antigen characterized by
a unique binding affinity for it.

Monoclonal antibodies can be prepared using hybridoma methods, such as those
described by Kohler and Milstein, Nature, 256:495 (1975). In a hybridoma method, a mouse,
hamster, or other appropriate host animal, is typically immunized with an immunizing agent to
elicit lymphocytes that produce or are capable of producing antibodies that will specifically bind
to the immunizing agent. Alternatively, the lymphocytes can be immunized in vitro.
The immunizing agent will typically include the protein antigen, a fragment thereof or a fusion
protein thereof. Generally, either peripheral blood lymphocytes are used if cells of human origin
are desired, or spleen cells or lymph node cells are used if non-human mammalian sources are
desired. The lymphocytes are then fused with an immortalized cell line using a suitable fusing
agent, such as polyethylene glycol, to form a hybridoma cell (Goding, Monoclonal Antibodies:
Principles and Practice, Academic Press, (1986) pp. 59-103). Immortalized cell lines are usually
transformed mammalian cells, particularly myeloma cells of rodent, bovine and human origin.
Usually, rat or mouse myeloma cell lines are employed. The hybridoma cells can be cultured in
a suitable culture medium that preferably contains one or more substances that inhibit the growth
or survival of the unfused, immortalized cells. For example, if the parental cells lack the enzyme
hypoxanthine guanine phosphoribosyl transferase (HGPRT or HPRT), the culture medium for
the hybridomas typically will include hypoxanthine, aminopterin, and thymidine ("HAT
medium"), which substances prevent the growth of HGPRT-deficient cells.

Preferred immortalized cell lines are those that fuse efficiently, support stable high level
expression of antibody by the selected antibody-producing cells, and are sensitive to a medium
such as HAT medium. More preferred immortalized cell lines are murine myeloma lines, which
can be obtained, for instance, from the Salk Institute Cell Distribution Center, San Diego,
California and the American Type Culture Collection, Manassas, Virginia. Human myeloma and
mouse-human heteromyeloma cell lines also have been described for the production of human
monoclonal antibodies (Kozbor, J. Immunol., 133:3001 (1984); Brodeur et al., Monoclonal
Antibody Production Techniques and Applications, Marcel Dekker, Inc., New York, (1987) pp.
51-63).

The culture medium in which the hybridoma cells are cultured can then be assayed for
the presence of monoclonal antibodies directed against the antigen. Preferably, the binding
specificity of monoclonal antibodies produced by the hybridoma cells is determined by
immunoprecipitation or by an in vitro binding assay, such as radioimmunoassay (RIA) or
enzyme-linked immunoabsorbent assay (ELISA). Such techniques and assays are known in the
art. The binding affinity of the monoclonal antibody can, for example, be determined by the
Scatchard analysis of Munson and Pollard, Anal. Biochem., 107:220 (1980). Preferably,
antibodies having a high degree of specificity and a high binding affinity for the target antigen
are isolated.
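
To illustrate how a binding affinity can be read from binding data, the sketch below performs a classical linear Scatchard fit: for a single class of binding sites, bound/free plotted against bound is a straight line with slope -1/Kd and x-intercept Bmax. This is a simplified textbook form, not the Munson and Pollard procedure itself, and the function name and synthetic data in the note are hypothetical.

```python
def scatchard_fit(bound, free):
    """Least-squares line through (bound, bound/free); returns (Kd, Bmax).

    Assumes one class of independent binding sites, so that
    bound / free = (Bmax - bound) / Kd.
    """
    if len(bound) != len(free) or len(bound) < 2:
        raise ValueError("need at least two paired measurements")
    x = list(bound)
    y = [b / f for b, f in zip(bound, free)]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    kd = -1.0 / slope          # dissociation constant
    bmax = -intercept / slope  # x-intercept: total binding sites
    return kd, bmax
```

With noise-free synthetic data generated from an assumed Kd of 2 and Bmax of 10 (bound = 10 * free / (2 + free)), the fit recovers both values; a smaller Kd corresponds to the higher binding affinity preferred above.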

After the desired hybridoma cells are identified, the clones can be subcloned by limiting
dilution procedures and grown by standard methods. Suitable culture media for this purpose
include, for example, Dulbecco's Modified Eagle's Medium and RPMI-1640 medium.
Alternatively, the hybridoma cells can be grown in vivo as ascites in a mammal.
The monoclonal antibodies secreted by the subclones can be isolated or purified from the culture
medium or ascites fluid by conventional immunoglobulin purification procedures such as, for
example, protein A-Sepharose, hydroxylapatite chromatography, gel electrophoresis, dialysis, or
affinity chromatography.

The monoclonal antibodies can also be made by recombinant DNA methods, such as
those described in U.S. Patent No. 4,816,567. DNA encoding the monoclonal antibodies of the
invention can be readily isolated and sequenced using conventional procedures (e.g., by using
oligonucleotide probes that are capable of binding specifically to genes encoding the heavy and
light chains of murine antibodies). The hybridoma cells of the invention serve as a preferred
source of such DNA. Once isolated, the DNA can be placed into expression vectors, which are
then transfected into host cells such as simian COS cells, Chinese hamster ovary (CHO) cells, or
myeloma cells that do not otherwise produce immunoglobulin protein, to obtain the synthesis of
monoclonal antibodies in the recombinant host cells. The DNA also can be modified, for
example, by substituting the coding sequence for human heavy and light chain constant domains
in place of the homologous murine sequences (U.S. Patent No. 4,816,567; Morrison, Nature 368,
812-13 (1994)) or by covalently joining to the immunoglobulin coding sequence all or part of the
coding sequence for a non-immunoglobulin polypeptide. Such a non-immunoglobulin
polypeptide can be substituted for the constant domains of an antibody of the invention, or can
be substituted for the variable domains of one antigen-combining site of an antibody of the
invention to create a chimeric bivalent antibody.

5.13.2 Humanized Antibodies

The antibodies directed against the protein antigens of the invention can further comprise
humanized antibodies or human antibodies. These antibodies are suitable for administration to
humans without engendering an immune response by the human against the administered
immunoglobulin. Humanized forms of antibodies are chimeric immunoglobulins,
immunoglobulin chains or fragments thereof (such as Fv, Fab, Fab', F(ab')2 or other
antigen-binding subsequences of antibodies) that are principally comprised of the sequence of a human
immunoglobulin, and contain minimal sequence derived from a non-human immunoglobulin.
Humanization can be performed following the method of Winter and co-workers (Jones et al.,
Nature, 321:522-525 (1986); Riechmann et al., Nature, 332:323-327 (1988); Verhoeyen et al.,
Science, 239:1534-1536 (1988)), by substituting rodent CDRs or CDR sequences for the
corresponding sequences of a human antibody. (See also U.S. Patent No. 5,225,539.) In some
instances, Fv framework residues of the human immunoglobulin are replaced by corresponding
non-human residues. Humanized antibodies can also comprise residues which are found neither
in the recipient antibody nor in the imported CDR or framework sequences. In general, the
humanized antibody will comprise substantially all of at least one, and typically two, variable
domains, in which all or substantially all of the CDR regions correspond to those of a non-human
immunoglobulin and all or substantially all of the framework regions are those of a human
immunoglobulin consensus sequence. The humanized antibody optimally also will comprise at
least a portion of an immunoglobulin constant region (Fc), typically that of a human
immunoglobulin (Jones et al., 1986; Riechmann et al., 1988; and Presta, Curr. Op. Struct. Biol.,
2:593-596 (1992)).

5.13.3 Human Antibodies 

Fully human antibodies relate to antibody molecules in which essentially the entire
sequences of both the light chain and the heavy chain, including the CDRs, arise from human
genes. Such antibodies are termed "human antibodies", or "fully human antibodies" herein.
Human monoclonal antibodies can be prepared by the trioma technique; the human B-cell
hybridoma technique (see Kozbor, et al., 1983 Immunol Today 4: 72) and the EBV hybridoma
technique to produce human monoclonal antibodies (see Cole, et al., 1985 In: Monoclonal
Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). Human monoclonal
antibodies may be utilized in the practice of the present invention and may be produced by using
human hybridomas (see Cote, et al., 1983. Proc Natl Acad Sci USA 80: 2026-2030) or by
transforming human B-cells with Epstein Barr Virus in vitro (see Cole, et al., 1985 In:
Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96).

In addition, human antibodies can also be produced using additional techniques,
including phage display libraries (Hoogenboom and Winter, J. Mol. Biol., 227:381 (1991);
Marks et al., J. Mol. Biol., 222:581 (1991)). Similarly, human antibodies can be made by
introducing human immunoglobulin loci into transgenic animals, e.g., mice in which the
endogenous immunoglobulin genes have been partially or completely inactivated. Upon
challenge, human antibody production is observed, which closely resembles that seen in humans
in all respects, including gene rearrangement, assembly, and antibody repertoire. This approach
is described, for example, in U.S. Patent Nos. 5,545,807; 5,545,806; 5,569,825; 5,625,126;
5,633,425; 5,661,016, and in Marks et al. (Bio/Technology 10, 779-783 (1992)); Lonberg et al.
(Nature 368 856-859 (1994)); Morrison (Nature 368, 812-13 (1994)); Fishwild et al. (Nature
Biotechnology 14, 845-51 (1996)); Neuberger (Nature Biotechnology 14, 826 (1996)); and
Lonberg and Huszar (Intern. Rev. Immunol. 13 65-93 (1995)).

Human antibodies may additionally be produced using transgenic nonhuman animals
which are modified so as to produce fully human antibodies rather than the animal's endogenous
antibodies in response to challenge by an antigen. (See PCT publication WO94/02602). The
endogenous genes encoding the heavy and light immunoglobulin chains in the nonhuman host
have been incapacitated, and active loci encoding human heavy and light chain immunoglobulins
are inserted into the host's genome. The human genes are incorporated, for example, using yeast
artificial chromosomes containing the requisite human DNA segments. An animal which
provides all the desired modifications is then obtained as progeny by crossbreeding intermediate
transgenic animals containing fewer than the full complement of the modifications. The
preferred embodiment of such a nonhuman animal is a mouse, and is termed the Xenomouse™
as disclosed in PCT publications WO 96/33735 and WO 96/34096. This animal produces B cells
which secrete fully human immunoglobulins. The antibodies can be obtained directly from the
animal after immunization with an immunogen of interest, as, for example, a preparation of a
polyclonal antibody, or alternatively from immortalized B cells derived from the animal, such as
hybridomas producing monoclonal antibodies. Additionally, the genes encoding the
immunoglobulins with human variable regions can be recovered and expressed to obtain the
antibodies directly, or can be further modified to obtain analogs of antibodies such as, for
example, single chain Fv molecules.

An example of a method of producing a nonhuman host, exemplified as a mouse, lacking
expression of an endogenous immunoglobulin heavy chain is disclosed in U.S. Patent No.
5,939,598. It can be obtained by a method including deleting the J segment genes from at least
one endogenous heavy chain locus in an embryonic stem cell to prevent rearrangement of the
locus and to prevent formation of a transcript of a rearranged immunoglobulin heavy chain locus,
the deletion being effected by a targeting vector containing a gene encoding a selectable marker;
and producing from the embryonic stem cell a transgenic mouse whose somatic and germ cells
contain the gene encoding the selectable marker.

A method for producing an antibody of interest, such as a human antibody, is disclosed in
U.S. Patent No. 5,916,771. It includes introducing an expression vector that contains a
nucleotide sequence encoding a heavy chain into one mammalian host cell in culture, introducing
an expression vector containing a nucleotide sequence encoding a light chain into another
mammalian host cell, and fusing the two cells to form a hybrid cell. The hybrid cell expresses an
antibody containing the heavy chain and the light chain.

In a further improvement on this procedure, a method for identifying a clinically relevant
epitope on an immunogen, and a correlative method for selecting an antibody that binds
immunospecifically to the relevant epitope with high affinity, are disclosed in PCT publication
WO 99/53049.



5.13.4 Fab Fragments and Single Chain Antibodies

According to the invention, techniques can be adapted for the production of single-chain
antibodies specific to an antigenic protein of the invention (see e.g., U.S. Patent No. 4,946,778).
In addition, methods can be adapted for the construction of Fab expression libraries (see e.g.,
Huse, et al., 1989 Science 246: 1275-1281) to allow rapid and effective identification of
monoclonal Fab fragments with the desired specificity for a protein or derivatives, fragments,
analogs or homologs thereof. Antibody fragments that contain the idiotypes to a protein antigen
may be produced by techniques known in the art including, but not limited to: (i) an F(ab')2
fragment produced by pepsin digestion of an antibody molecule; (ii) an Fab fragment generated
by reducing the disulfide bridges of an F(ab')2 fragment; (iii) an Fab fragment generated by the
treatment of the antibody molecule with papain and a reducing agent; and (iv) Fv fragments.

5.13.5 Bispecific Antibodies

Bispecific antibodies are monoclonal, preferably human or humanized, antibodies that
have binding specificities for at least two different antigens. In the present case, one of the
binding specificities is for an antigenic protein of the invention. The second binding target is any
other antigen, and advantageously is a cell-surface protein or receptor or receptor subunit.

Methods for making bispecific antibodies are known in the art. Traditionally, the
recombinant production of bispecific antibodies is based on the co-expression of two
immunoglobulin heavy-chain/light-chain pairs, where the two heavy chains have different
specificities (Milstein and Cuello, Nature, 305:537-539 (1983)). Because of the random
assortment of immunoglobulin heavy and light chains, these hybridomas (quadromas) produce a
potential mixture of ten different antibody molecules, of which only one has the correct
bispecific structure. The purification of the correct molecule is usually accomplished by affinity
chromatography steps. Similar procedures are disclosed in WO 93/08829, published 13 May
1993, and in Traunecker et al., 1991 EMBO J., 10:3655-3659.
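
The figure of ten different molecules follows from simple counting, which the sketch below reproduces. It assumes two heavy chains (labeled "A" and "B") and two light chains (labeled "a" and "b") that pair at random, with each antibody made of two heavy/light arms whose order does not matter; the labels are hypothetical.

```python
from itertools import combinations_with_replacement, product

heavy_chains = ["A", "B"]  # hypothetical labels for the two heavy chains
light_chains = ["a", "b"]  # hypothetical labels for the two light chains

# Each binding arm is one heavy chain paired with one light chain: 4 arm types.
arms = list(product(heavy_chains, light_chains))

# An antibody is an unordered pair of arms; repeats allowed (e.g. two A/a arms).
molecules = list(combinations_with_replacement(arms, 2))

# Only one species carries both cognate arms, A/a together with B/b.
correct = [m for m in molecules if set(m) == {("A", "a"), ("B", "b")}]
```

Here `len(molecules)` is 10 and `len(correct)` is 1, matching the statement that only one of the ten species has the desired bispecific structure and explaining why a purification step is needed.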

Antibody variable domains with the desired binding specificities (antibody-antigen
combining sites) can be fused to immunoglobulin constant domain sequences. The fusion
preferably is with an immunoglobulin heavy-chain constant domain, comprising at least part of
the hinge, CH2, and CH3 regions. It is preferred to have the first heavy-chain constant region
(CH1) containing the site necessary for light-chain binding present in at least one of the fusions.
DNAs encoding the immunoglobulin heavy-chain fusions and, if desired, the immunoglobulin
light chain, are inserted into separate expression vectors, and are co-transfected into a suitable
host organism. For further details of generating bispecific antibodies see, for example, Suresh et
al., Methods in Enzymology, 121:210 (1986).

According to another approach described in WO 96/27011, the interface between a pair
of antibody molecules can be engineered to maximize the percentage of heterodimers which are
recovered from recombinant cell culture. The preferred interface comprises at least a part of the
CH3 region of an antibody constant domain. In this method, one or more small amino acid side
chains from the interface of the first antibody molecule are replaced with larger side chains (e.g.
tyrosine or tryptophan). Compensatory "cavities" of identical or similar size to the large side
chain(s) are created on the interface of the second antibody molecule by replacing large amino
acid side chains with smaller ones (e.g. alanine or threonine). This provides a mechanism for
increasing the yield of the heterodimer over other unwanted end-products such as homodimers.

Bispecific antibodies can be prepared as full length antibodies or antibody fragments (e.g.
F(ab')2 bispecific antibodies). Techniques for generating bispecific antibodies from antibody
fragments have been described in the literature. For example, bispecific antibodies can be
prepared using chemical linkage. Brennan et al., Science 229:81 (1985) describe a procedure
wherein intact antibodies are proteolytically cleaved to generate F(ab')2 fragments. These
fragments are reduced in the presence of the dithiol complexing agent sodium arsenite to
stabilize vicinal dithiols and prevent intermolecular disulfide formation. The Fab' fragments
generated are then converted to thionitrobenzoate (TNB) derivatives. One of the Fab'-TNB
derivatives is then reconverted to the Fab'-thiol by reduction with mercaptoethylamine and is
mixed with an equimolar amount of the other Fab'-TNB derivative to form the bispecific
antibody. The bispecific antibodies produced can be used as agents for the selective
immobilization of enzymes.

Additionally, Fab' fragments can be directly recovered from E. coli and chemically
coupled to form bispecific antibodies. Shalaby et al., J. Exp. Med. 175:217-225 (1992) describe
the production of a fully humanized bispecific antibody F(ab')2 molecule. Each Fab' fragment
was separately secreted from E. coli and subjected to directed chemical coupling in vitro to form
the bispecific antibody. The bispecific antibody thus formed was able to bind to cells
overexpressing the ErbB2 receptor and normal human T cells, as well as trigger the lytic activity
of human cytotoxic lymphocytes against human breast tumor targets.

Various techniques for making and isolating bispecific antibody fragments directly from
recombinant cell culture have also been described. For example, bispecific antibodies have been
produced using leucine zippers. Kostelny et al., J. Immunol. 148(5):1547-1553 (1992). The
leucine zipper peptides from the Fos and Jun proteins were linked to the Fab' portions of two
different antibodies by gene fusion. The antibody homodimers were reduced at the hinge region
to form monomers and then re-oxidized to form the antibody heterodimers. This method can
also be utilized for the production of antibody homodimers. The "diabody" technology
described by Hollinger et al., Proc. Natl. Acad. Sci. USA 90:6444-6448 (1993) has provided an
alternative mechanism for making bispecific antibody fragments. The fragments comprise a
heavy-chain variable domain (VH) connected to a light-chain variable domain (VL) by a linker
which is too short to allow pairing between the two domains on the same chain. Accordingly,
the VH and VL domains of one fragment are forced to pair with the complementary VL and VH
domains of another fragment, thereby forming two antigen-binding sites. Another strategy for
making bispecific antibody fragments by the use of single-chain Fv (sFv) dimers has also been
reported. See, Gruber et al., J. Immunol. 152:5368 (1994).

Antibodies with more than two valencies are contemplated. For example, trispecific
antibodies can be prepared. Tutt et al., J. Immunol. 147:60 (1991).
Exemplary bispecific antibodies can bind to two different epitopes, at least one of which
originates in the protein antigen of the invention. Alternatively, an anti-antigenic arm of an
immunoglobulin molecule can be combined with an arm which binds to a triggering molecule on
a leukocyte such as a T-cell receptor molecule (e.g. CD2, CD3, CD28, or B7), or Fc receptors for
IgG (FcγR), such as FcγRI (CD64), FcγRII (CD32) and FcγRIII (CD16) so as to focus cellular
defense mechanisms to the cell expressing the particular antigen. Bispecific antibodies can also
be used to direct cytotoxic agents to cells which express a particular antigen. These antibodies
possess an antigen-binding arm and an arm which binds a cytotoxic agent or a radionuclide
chelator, such as EOTUBE, DTPA, DOTA, or TETA. Another bispecific antibody of interest
binds the protein antigen described herein and further binds tissue factor (TF).



5.13.6 Heteroconjugate Antibodies

Heteroconjugate antibodies are also within the scope of the present invention.
Heteroconjugate antibodies are composed of two covalently joined antibodies. Such antibodies
have, for example, been proposed to target immune system cells to unwanted cells (U.S. Patent
No. 4,676,980), and for treatment of HIV infection (WO 91/00360; WO 92/200373; EP 03089).
It is contemplated that the antibodies can be prepared in vitro using known methods in synthetic
protein chemistry, including those involving crosslinking agents. For example, immunotoxins
can be constructed using a disulfide exchange reaction or by forming a thioether bond.
Examples of suitable reagents for this purpose include iminothiolate and
methyl-4-mercaptobutyrimidate and those disclosed, for example, in U.S. Patent No. 4,676,980.



5.13.7 Effector Function Engineering 

It can be desirable to modify the antibody of the invention with respect to effector function, so as
to enhance, e.g., the effectiveness of the antibody in treating cancer. For example, cysteine
residue(s) can be introduced into the Fc region, thereby allowing interchain disulfide bond
formation in this region. The homodimeric antibody thus generated can have improved
internalization capability and/or increased complement-mediated cell killing and
antibody-dependent cellular cytotoxicity (ADCC). See Caron et al., J. Exp. Med., 176: 1191-1195 (1992)
and Shopes, J. Immunol., 148: 2918-2922 (1992). Homodimeric antibodies with enhanced
anti-tumor activity can also be prepared using heterobifunctional cross-linkers as described in Wolff
et al., Cancer Research, 53: 2560-2565 (1993). Alternatively, an antibody can be engineered that
has dual Fc regions and can thereby have enhanced complement lysis and ADCC capabilities.
See Stevenson et al., Anti-Cancer Drug Design, 3: 219-230 (1989).

5.13.8 Immunoconjugates 

The invention also pertains to immunoconjugates comprising an antibody conjugated to a
cytotoxic agent such as a chemotherapeutic agent, toxin (e.g., an enzymatically active toxin of
bacterial, fungal, plant, or animal origin, or fragments thereof), or a radioactive isotope (i.e., a
radioconjugate).

Chemotherapeutic agents useful in the generation of such immunoconjugates have been
described above. Enzymatically active toxins and fragments thereof that can be used include
diphtheria A chain, nonbinding active fragments of diphtheria toxin, exotoxin A chain (from
Pseudomonas aeruginosa), ricin A chain, abrin A chain, modeccin A chain, alpha-sarcin,
Aleurites fordii proteins, dianthin proteins, Phytolaca americana proteins (PAPI, PAPII, and
PAP-S), momordica charantia inhibitor, curcin, crotin, sapaonaria officinalis inhibitor, gelonin,
mitogellin, restrictocin, phenomycin, enomycin, and the tricothecenes. A variety of
radionuclides are available for the production of radioconjugated antibodies. Examples include
¹¹¹In, ⁹⁰Y, and ¹⁸⁶Re.

Conjugates of the antibody and cytotoxic agent are made using a variety of bifunctional
protein-coupling agents such as N-succinimidyl-3-(2-pyridyldithiol) propionate (SPDP),
iminothiolane (IT), bifunctional derivatives of imidoesters (such as dimethyl adipimidate HCl),
active esters (such as disuccinimidyl suberate), aldehydes (such as glutaraldehyde), bis-azido
compounds (such as bis (p-azidobenzoyl) hexanediamine), bis-diazonium derivatives (such as
bis-(p-diazoniumbenzoyl)-ethylenediamine), diisocyanates (such as tolylene 2,6-diisocyanate),
and bis-active fluorine compounds (such as 1,5-difluoro-2,4-dinitrobenzene). For example, a
ricin immunotoxin can be prepared as described in Vitetta et al., Science, 238: 1098 (1987).
Carbon-14-labeled 1-isothiocyanatobenzyl-3-methyldiethylene triaminepentaacetic acid
(MX-DTPA) is an exemplary chelating agent for conjugation of radionuclide to the antibody. See
WO94/11026.

In another embodiment, the antibody can be conjugated to a "receptor" (such as
streptavidin) for utilization in tumor pretargeting wherein the antibody-receptor conjugate is
administered to the patient, followed by removal of unbound conjugate from the circulation 
using a clearing agent and then administration of a "ligand" (e.g., avidin) that is in turn 
conjugated to a cytotoxic agent. 

4.14 COMPUTER READABLE SEQUENCES

In one application of this embodiment, a nucleotide sequence of the present invention can 
be recorded on computer readable media. As used herein, "computer readable media" refers to 
any medium which can be read and accessed directly by a computer. Such media include, but 
are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and 
magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM 
and ROM; and hybrids of these categories such as magnetic/optical storage media. A skilled 
artisan can readily appreciate how any of the presently known computer readable media can
be used to create a manufacture comprising computer readable medium having recorded thereon
a nucleotide sequence of the present invention. As used herein, "recorded" refers to a process for
storing information on computer readable medium. A skilled artisan can readily adopt any of the 
presently known methods for recording information on computer readable medium to generate 
manufactures comprising the nucleotide sequence information of the present invention. 

A variety of data storage structures are available to a skilled artisan for creating a
computer readable medium having recorded thereon a nucleotide sequence of the present
invention. The choice of the data storage structure will generally be based on the means chosen
to access the stored information. In addition, a variety of data processor programs and formats
can be used to store the nucleotide sequence information of the present invention on computer
readable medium. The sequence information can be represented in a word processing text file,
formatted in commercially-available software such as WordPerfect and Microsoft Word, or
represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase,
Oracle, or the like. A skilled artisan can readily adapt any number of data processor structuring
formats (e.g., text file or database) in order to obtain computer readable medium having recorded
thereon the nucleotide sequence information of the present invention.
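
By way of illustration only, the two storage options mentioned above (an ASCII text file and a database application) might be sketched as follows. The FASTA-style text layout, the SQLite database, the table name `sequences`, and the function names `to_fasta` and `store_sequences` are assumptions made for this sketch and are not taken from the present disclosure.

```python
import sqlite3


def to_fasta(seq_id, residues, width=60):
    """Format one sequence as ASCII text: a '>' header line, then wrapped residues.

    The FASTA-style layout is an assumption of this sketch.
    """
    lines = [residues[i:i + width] for i in range(0, len(residues), width)]
    return ">" + seq_id + "\n" + "\n".join(lines) + "\n"


def store_sequences(records, path=":memory:"):
    """Store (seq_id, residues) pairs in a SQLite table and return the connection.

    SQLite stands in here for any database application; the schema is hypothetical.
    """
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sequences "
        "(seq_id TEXT PRIMARY KEY, residues TEXT NOT NULL)"
    )
    conn.executemany("INSERT INTO sequences VALUES (?, ?)", records)
    conn.commit()
    return conn


# Example use with a made-up identifier and sequence.
conn = store_sequences([("example_1", "ATGGCCAAGCTT")])
row = conn.execute(
    "SELECT residues FROM sequences WHERE seq_id = ?", ("example_1",)
).fetchone()
text_record = to_fasta("example_1", row[0], width=8)
```

Which structure is preferable depends, as stated above, on how the stored information will be accessed: a flat text file suits sequential reading, while a database table suits lookup by identifier.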

By providing any of the nucleotide sequences SEQ ID NO:1-1786 and 3573-5358 or a
representative fragment thereof, or a nucleotide sequence at least 95% identical to any of the
nucleotide sequences of SEQ ID NO:1-1786 and 3573-5358 in computer readable form, a skilled
artisan can routinely access the sequence information for a variety of purposes. Computer
software is publicly available which allows a skilled artisan to access sequence information
provided in a computer readable medium. The examples which follow demonstrate how
software which implements the BLAST (Altschul et al., J. Mol. Biol. 215:403-410 (1990)) and
BLAZE (Brutlag et al., Comp. Chem. 17:203-207 (1993)) search algorithms on a Sybase system
is used to identify open reading frames (ORFs) within a nucleic acid sequence. Such ORFs may
be protein encoding fragments and may be useful in producing commercially important proteins
such as enzymes used in fermentation reactions and in the production of commercially useful
metabolites.
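
As a rough illustration of what identifying an ORF within a nucleic acid sequence involves, the following minimal sketch scans the three forward reading frames for an ATG codon followed by an in-frame stop codon. It is not the BLAST or BLAZE software referred to above; the function name `find_orfs`, the `min_codons` parameter, and the example sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=2):
    """Return (start, end, frame) for each forward-strand ORF.

    An ORF is taken to be ATG ... stop, read in frame. `start` is the 0-based
    index of the A of ATG; `end` is exclusive and includes the stop codon.
    Only ORFs with at least `min_codons` codons before the stop are reported.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first ATG opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # resume scanning for the next ATG
    return orfs


# Made-up example: the only ORF is ATG AAA TTT TAG in reading frame 2.
example_orfs = find_orfs("CCATGAAATTTTAGGG")
```

A fuller treatment would also scan the reverse complement strand; that is omitted here for brevity.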

As used herein, "a computer-based system" refers to the hardware means, software 
means, and data storage means used to analyze the nucleotide sequence information of the 
present invention. The minimum hardware means of the computer-based systems of the present 
invention comprises a central processing unit (CPU), input means, output means, and data 
storage means. A skilled artisan can readily appreciate that any one of the currently available 
computer-based systems is suitable for use in the present invention. As stated above, the
computer-based systems of the present invention comprise a data storage means having stored
therein a nucleotide sequence of the present invention and the necessary hardware means and
software means for supporting and implementing a search means. As used herein, "data storage
means" refers to memory which can store nucleotide sequence information of the present
invention, or a memory access means which can access manufactures having recorded thereon
the nucleotide sequence information of the present invention.

As used herein, "search means" refers to one or more programs which are implemented
on the computer-based system to compare a target sequence or target structural motif with the
sequence information stored within the data storage means. Search means are used to identify
fragments or regions of a known sequence which match a particular target sequence or target
motif. A variety of known algorithms are disclosed publicly and a variety of commercially
available software for conducting search means exists and can be used in the computer-based
systems of the present invention. Examples of such software include, but are not limited to,
Smith-Waterman, MacPattern (EMBL), BLASTN and BLASTX (NCBI). A
skilled artisan can readily recognize that any one of the available algorithms or implementing
software packages for conducting homology searches can be adapted for use in the present
computer-based systems. As used herein, a "target sequence" can be any nucleic acid or amino
acid sequence of six or more nucleotides or two or more amino acids. A skilled artisan can
readily recognize that the longer a target sequence is, the less likely a target sequence will be
present as a random occurrence in the database. The most preferred sequence length of a target
sequence is from about 10 to 300 amino acids, more preferably from about 30 to 100 nucleotide
residues. However, it is well recognized that searches for commercially important fragments,
such as sequence fragments involved in gene expression and protein processing, may be of
shorter length.
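
The relationship between target length and chance matches can be made concrete with a back-of-the-envelope estimate. The sketch below assumes a database of independent, equally likely residues, an idealization that real sequence databases do not satisfy; the helper name `expected_random_hits` is hypothetical.

```python
def expected_random_hits(target_len, db_len, alphabet_size=4):
    """Expected number of exact chance matches of one target sequence.

    Assumes every residue in the database is drawn independently and uniformly
    from an alphabet of `alphabet_size` letters (4 for nucleotides, 20 for
    amino acids). There are db_len - target_len + 1 possible start positions,
    each matching with probability alphabet_size ** -target_len.
    """
    positions = max(db_len - target_len + 1, 0)
    return positions / alphabet_size ** target_len


# A six-nucleotide target is expected about once per 4096 positions by chance,
# whereas a 30-nucleotide target is far less likely to occur at random.
hits_6 = expected_random_hits(6, 1_000_000)
hits_30 = expected_random_hits(30, 1_000_000)
```

Under these assumptions each added nucleotide lowers the expected number of chance matches by roughly a factor of four, and each added amino acid by roughly a factor of twenty.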

As used herein, "a target structural motif," or "target motif," refers to any rationally 
selected sequence or combination of sequences in which the sequence(s) are chosen based on a 
three-dimensional configuration which is formed upon the folding of the target motif. There are 
a variety of target motifs known in the art. Protein target motifs include, but are not limited to, 
enzyme active sites and signal sequences. Nucleic acid target motifs include, but are not limited 
to, promoter sequences, hairpin structures and inducible expression elements (protein binding 
sequences). 
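
As one hedged example of searching for a nucleic acid target motif, a candidate hairpin can be located by looking for a short stem whose reverse complement reappears a few bases downstream. The function names, the default stem length, and the loop-length limits below are illustrative assumptions, and the sketch considers only perfect Watson-Crick stems.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_hairpins(seq, stem=4, min_loop=3, max_loop=8):
    """Return (start, stem_sequence, loop_length) for each candidate stem-loop.

    A candidate is a stretch of `stem` bases followed, after a loop of
    min_loop..max_loop bases, by the exact reverse complement of that stretch.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 2 * stem - min_loop + 1):
        left = seq[i:i + stem]
        partner = reverse_complement(left)
        for loop in range(min_loop, max_loop + 1):
            j = i + stem + loop  # where the 3' arm of the stem would begin
            if seq[j:j + stem] == partner:
                hits.append((i, left, loop))
    return hits


# Made-up example: GGGC pairs with GCCC across a four-base AAAA loop.
example_hairpins = find_hairpins("GGGCAAAAGCCC")
```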

4.15 TRIPLE HELIX FORMATION 

In addition, the fragments of the present invention, as broadly described, can be used to
control gene expression through triple helix formation or antisense DNA or RNA, both of which
methods are based on the binding of a polynucleotide sequence to DNA or RNA.
Polynucleotides suitable for use in these methods are preferably 20 to 40 bases in length and are
designed to be complementary to a region of the gene involved in transcription (triple helix - see
Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan
et al., Science 251:1360 (1991)) or to the mRNA itself (antisense - Okano, J. Neurochem. 56:560
(1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca
Raton, FL (1988)). Triple helix-formation optimally results in a shut-off of RNA transcription
from DNA, while antisense RNA hybridization blocks translation of an mRNA molecule into
polypeptide. Both techniques have been demonstrated to be effective in model systems.
Information contained in the sequences of the present invention is necessary for the design of an
antisense or triple helix oligonucleotide.
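
The way sequence information feeds into antisense design can be sketched in a few lines: the oligonucleotide is the reverse complement of the chosen region of the mRNA. The function name `antisense_oligo`, the choice of a DNA oligonucleotide, and the example sequence are assumptions for illustration; selecting an effective target region in practice involves considerations not modeled here.

```python
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_oligo(mrna, start, length=20):
    """Return a DNA oligonucleotide (5' to 3') complementary to a region of an mRNA.

    `mrna` may be written with U or T. `start` is a 0-based offset into the
    mRNA and `length` must fall within the 20 to 40 base range given above.
    """
    if not 20 <= length <= 40:
        raise ValueError("length should be 20 to 40 bases")
    target = mrna.upper().replace("U", "T")[start:start + length]
    if len(target) < length:
        raise ValueError("target region runs past the end of the sequence")
    # Complement each base, then reverse so the result reads 5' to 3'.
    return target.translate(DNA_COMPLEMENT)[::-1]


# Made-up 24-nucleotide mRNA; the oligo targets its first 20 bases.
example_oligo = antisense_oligo("AUGGCCAAGCUUGGAUCCGAAUUC", 0, 20)
```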


4.16 DIAGNOSTIC ASSAYS AND KITS 

The present invention further provides methods to identify the presence or expression of 
one of the ORFs of the present invention, or homolog thereof, in a test sample, using a nucleic
acid probe or antibodies of the present invention, optionally conjugated or otherwise associated
with a suitable label.

In general, methods for detecting a polynucleotide of the invention can comprise 
contacting a sample with a compound that binds to and forms a complex with the polynucleotide 
for a period sufficient to form the complex, and detecting the complex, so that if a complex is
detected, a polynucleotide of the invention is detected in the sample. Such methods can also
comprise contacting a sample under stringent hybridization conditions with nucleic acid primers
that anneal to a polynucleotide of the invention under such conditions, and amplifying annealed
polynucleotides, so that if a polynucleotide is amplified, a polynucleotide of the invention is 
detected in the sample. 

In general, methods for detecting a polypeptide of the invention can comprise contacting 
a sample with a compound that binds to and forms a complex with the polypeptide for a period
sufficient to form the complex, and detecting the complex, so that if a complex is detected, a 
polypeptide of the invention is detected in the sample. 

In detail, such methods comprise incubating a test sample with one or more of the 
antibodies or one or more of the nucleic acid probes of the present invention and assaying for 
binding of the nucleic acid probes or antibodies to components within the test sample.

Conditions for incubating a nucleic acid probe or antibody with a test sample vary. 
Incubation conditions depend on the format employed in the assay, the detection methods 
employed, and the type and nature of the nucleic acid probe or antibody used in the assay. One 
skilled in the art will recognize that any one of the commonly available hybridization, 
amplification or immunological assay formats can readily be adapted to employ the nucleic acid
probes or antibodies of the present invention. Examples of such assays can be found in Chard,
T., An Introduction to Radioimmunoassay and Related Techniques, Elsevier Science Publishers,
Amsterdam, The Netherlands (1986); Bullock, G.R. et al., Techniques in Immunocytochemistry,
Academic Press, Orlando, FL Vol. 1 (1982), Vol. 2 (1983), Vol. 3 (1985); Tijssen, P., Practice
and Theory of Immunoassays: Laboratory Techniques in Biochemistry and Molecular Biology,
Elsevier Science Publishers, Amsterdam, The Netherlands (1985). The test samples of the
present invention include cells, protein or membrane extracts of cells, or biological fluids such as
sputum, blood, serum, plasma, or urine. The test sample used in the above-described method
will vary based on the assay format, nature of the detection method and the tissues, cells or
extracts used as the sample to be assayed. Methods for preparing protein extracts or membrane
extracts of cells are well known in the art and can readily be adapted in order to obtain a
sample which is compatible with the system utilized.

In another embodiment of the present invention, kits are provided which contain the 
necessary reagents to carry out the assays of the present invention. Specifically, the invention
provides a compartment kit to receive, in close confinement, one or more containers which
comprises: (a) a first container comprising one of the probes or antibodies of the present
invention; and (b) one or more other containers comprising one or more of the following: wash
reagents, reagents capable of detecting presence of a bound probe or antibody. 

In detail, a compartment kit includes any kit in which reagents are contained in separate 
containers. Such containers include small glass containers, plastic containers or strips of plastic
or paper. Such containers allow one to efficiently transfer reagents from one compartment to
another compartment such that the samples and reagents are not cross-contaminated, and the
agents or solutions of each container can be added in a quantitative fashion from one
compartment to another. Such containers will include a container which will accept the test
sample, a container which contains the antibodies used in the assay, containers which contain
wash reagents (such as phosphate buffered saline, Tris-buffers, etc.), and containers which
contain the reagents used to detect the bound antibody or probe. Types of detection reagents 
include labeled nucleic acid probes, labeled secondary antibodies, or in the alternative, if the 
primary antibody is labeled, the enzymatic, or antibody binding reagents which are capable of 
reacting with the labeled antibody. One skilled in the art will readily recognize that the disclosed 
probes and antibodies of the present invention can be readily incorporated into one of the 
established kit formats which are well known in the art. 

4.17 MEDICAL IMAGING

The novel polypeptides and binding partners of the invention are useful in medical
imaging of sites expressing the molecules of the invention (e.g., where the polypeptide of the
invention is involved in the immune response, for imaging sites of inflammation or infection).
See, e.g., Kunkel et al., U.S. Pat. No. 5,413,778. Such methods involve chemical attachment of
a labeling or imaging agent, administration of the labeled polypeptide to a subject in a
pharmaceutically acceptable carrier, and imaging the labeled polypeptide in vivo at the target
site.

4.18 SCREENING ASSAYS 

Using the isolated proteins and polynucleotides of the invention, the present invention 
further provides methods of obtaining and identifying agents which bind to a polypeptide 
encoded by an ORF corresponding to any of the nucleotide sequences set forth in SEQ ID
NO:1-1786 and 3573-5358, or bind to a specific domain of the polypeptide encoded by the nucleic
acid. In detail, said method comprises the steps of: 

(a) contacting an agent with an isolated protein encoded by an ORF of the present
invention, or nucleic acid of the invention; and

(b) determining whether the agent binds to said protein or said nucleic acid.

In general, therefore, such methods for identifying compounds that bind to a
polynucleotide of the invention can comprise contacting a compound with a polynucleotide of
the invention for a time sufficient to form a polynucleotide/compound complex, and detecting
the complex, so that if a polynucleotide/compound complex is detected, a compound that binds
to a polynucleotide of the invention is identified.

Likewise, in general, therefore, such methods for identifying compounds that bind to a 
polypeptide of the invention can comprise contacting a compound with a polypeptide of the 
invention for a time sufficient to form a polypeptide/compound complex, and detecting the
complex, so that if a polypeptide/compound complex is detected, a compound that binds to a
polypeptide of the invention is identified.

Methods for identifying compounds that bind to a polypeptide of the invention can also
comprise contacting a compound with a polypeptide of the invention in a cell for a time
sufficient to form a polypeptide/compound complex, wherein the complex drives expression of a
reporter gene sequence in the cell, and detecting the complex by detecting reporter gene
sequence expression, so that if a polypeptide/compound complex is detected, a compound that
binds a polypeptide of the invention is identified.

Compounds identified via such methods can include compounds which modulate the 
activity of a polypeptide of the invention (that is, increase or decrease its activity, relative to
activity observed in the absence of the compound). Alternatively, compounds identified via such
methods can include compounds which modulate the expression of a polynucleotide of the 
invention (that is, increase or decrease expression relative to expression levels observed in the 
absence of the compound). Compounds, such as compounds identified via the methods of the 
invention, can be tested using standard assays well known to those of skill in the art for their 
ability to modulate activity/expression. 

The agents screened in the above assay can be, but are not limited to, peptides,
carbohydrates, vitamin derivatives, or other pharmaceutical agents. The agents can be selected 
and screened at random or rationally selected or designed using protein modeling techniques. 

For random screening, agents such as peptides, carbohydrates, pharmaceutical agents and 
the like are selected at random and are assayed for their ability to bind to the protein encoded by 
the ORF of the present invention. Alternatively, agents may be rationally selected or designed.
As used herein, an agent is said to be "rationally selected or designed" when the agent is chosen 
based on the configuration of the particular protein. For example, one skilled in the art can 
readily adapt currently available procedures to generate peptides, pharmaceutical agents and the 
like, capable of binding to a specific peptide sequence, in order to generate rationally designed 
antipeptide peptides, for example see Hurby et al., "Application of Synthetic Peptides: Antisense
Peptides," In Synthetic Peptides, A User's Guide, W.H. Freeman, NY (1992), pp. 289-307, and
Kaspczak et al., Biochemistry 28:9230-8 (1989), or pharmaceutical agents, or the like.

In addition to the foregoing, one class of agents of the present invention, as broadly
described, can be used to control gene expression through binding to one of the ORFs or EMFs
of the present invention. As described above, such agents can be randomly screened or
rationally designed/selected. Targeting the ORF or EMF allows a skilled artisan to design
sequence specific or element specific agents, modulating the expression of either a single ORF or
multiple ORFs which rely on the same EMF for expression control. One class of DNA binding
agents are agents which contain base residues which hybridize or form a triple helix formation 
by binding to DNA or RNA. Such agents can be based on the classic phosphodiester, 
ribonucleic acid backbone, or can be a variety of sulfhydryl or polymeric derivatives which have 
base attachment capacity. 

Agents suitable for use in these methods preferably contain 20 to 40 bases and are 
designed to be complementary to a region of the gene involved in transcription (triple helix - see 
Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan et
al., Science 251:1360 (1991)) or to the mRNA itself (antisense - Okano, J. Neurochem. 56:560
(1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca
Raton, FL (1988)). Triple helix-formation optimally results in a shut-off of RNA transcription
from DNA, while antisense RNA hybridization blocks translation of an mRNA molecule into
polypeptide. Both techniques have been demonstrated to be effective in model systems. 
Information contained in the sequences of the present invention is necessary for the design of an 
antisense or triple helix oligonucleotide and other DNA binding agents. 

Agents which bind to a protein encoded by one of the ORFs of the present invention can 
be used as a diagnostic agent. Agents which bind to a protein encoded by one of the ORFs of the 
present invention can be formulated using known techniques to generate a pharmaceutical 
composition.

4.19 USE OF NUCLEIC ACIDS AS PROBES 

Another aspect of the subject invention is to provide for polypeptide-specific nucleic acid 
hybridization probes capable of hybridizing with naturally occurring nucleotide sequences. The 
hybridization probes of the subject invention may be derived from any of the nucleotide 
sequences SEQ ID NO:1-1786 and 3573-5358. Because the corresponding gene is only
expressed in a limited number of tissues, a hybridization probe derived from any of the
nucleotide sequences SEQ ID NO:1-1786 and 3573-5358 can be used as an indicator of the
presence of RNA of cell type of such a tissue in a sample. 

Any suitable hybridization technique can be employed, such as, for example, in situ 
hybridization. PCR as described in US Patents Nos. 4,683,195 and 4,965,188 provides
additional uses for oligonucleotides based upon the nucleotide sequences. Such probes used in
PCR may be of recombinant origin, may be chemically synthesized, or a mixture of both. The
probe will comprise a discrete nucleotide sequence for the detection of identical sequences or a 
degenerate pool of possible sequences for identification of closely related genomic sequences. 
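
To make the notion of a degenerate pool concrete, the following sketch expands a probe written with IUPAC ambiguity codes into every discrete sequence it represents. The function name `expand_degenerate` and the example probes are hypothetical; the code table is the standard IUPAC nucleotide notation.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(probe):
    """Return every discrete sequence represented by a degenerate probe."""
    choices = [IUPAC[base] for base in probe.upper()]
    return ["".join(combo) for combo in product(*choices)]


# "GGNCAY" has one four-fold and one two-fold position, so the pool has 8 members.
pool = expand_degenerate("GGNCAY")
```

The pool size is the product of the degeneracy at each position, which is why probes with many ambiguous positions quickly become impractical to synthesize as individual sequences.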

Other means for producing specific hybridization probes for nucleic acids include the 
cloning of nucleic acid sequences into vectors for the production of mRNA probes. Such vectors 
are known in the art and are commercially available and may be used to synthesize RNA probes 
in vitro by means of the addition of the appropriate RNA polymerase such as T7 or SP6 RNA
polymerase and the appropriate radioactively labeled nucleotides. The nucleotide sequences may
be used to construct hybridization probes for mapping their respective genomic sequences. The
nucleotide sequence provided herein may be mapped to a chromosome or specific regions of a
chromosome using well known genetic and/or chromosomal mapping techniques. These
techniques include in situ hybridization, linkage analysis against known chromosomal markers,
hybridization screening with libraries or flow-sorted chromosomal preparations specific to
known chromosomes, and the like. The technique of fluorescent in situ hybridization of
chromosome spreads has been described, among other places, in Verma et al. (1988) Human
Chromosomes: A Manual of Basic Techniques, Pergamon Press, New York NY.

Fluorescent in situ hybridization of chromosomal preparations and other physical 
chromosome mapping techniques may be correlated with additional genetic map data. Examples 
of genetic map data can be found in the 1994 Genome Issue of Science (265:1981f). Correlation
between the location of a nucleic acid on a physical chromosomal map and a specific disease (or 
predisposition to a specific disease) may help delimit the region of DNA associated with that 
genetic disease. The nucleotide sequences of the subject invention may be used to detect 
differences in gene sequences between normal, carrier or affected individuals. 

4.20 PREPARATION OF SUPPORT BOUND OLIGONUCLEOTIDES

Oligonucleotides, i.e., small nucleic acid segments, may be readily prepared by, for
example, directly synthesizing the oligonucleotide by chemical means, as is commonly practiced 
using an automated oligonucleotide synthesizer. 

Support bound oligonucleotides may be prepared by any of the methods known to those of
skill in the art using any suitable support such as glass, polystyrene or Teflon. One strategy is to
precisely spot oligonucleotides synthesized by standard synthesizers. Immobilization can be
achieved using passive adsorption (Inouye & Hondo, (1990) J. Clin. Microbiol. 28(6) 1469-72);
using UV light (Nagata et al., 1985; Dahlen et al., 1987; Morrissey & Collins, (1989) Mol. Cell
Probes 3(2) 189-207) or by covalent binding of base modified DNA (Keller et al., 1988; 1989); all
references being specifically incorporated herein.

Another strategy that may be employed is the use of the strong biotin-streptavidin 
interaction as a linker. For example, Broude et al. (1994) Proc. Natl. Acad. Sci. USA 91(8) 3072-6,
describe the use of biotinylated probes, although these are duplex probes, that are immobilized on
streptavidin-coated magnetic beads. Streptavidin-coated beads may be purchased from Dynal, Oslo.
Of course, this same linking chemistry is applicable to coating any surface with streptavidin.
Biotinylated probes may be purchased from various sources, such as, e.g., Operon Technologies
(Alameda, CA). 

Nunc Laboratories (Naperville, IL) is also selling suitable material that could be used. Nunc
Laboratories have developed a method by which DNA can be covalently bound to the microwell
surface termed CovaLink NH. CovaLink NH is a polystyrene surface grafted with secondary amino
groups (>NH) that serve as bridge-heads for further covalent coupling. CovaLink Modules may be
purchased from Nunc Laboratories. DNA molecules may be bound to CovaLink exclusively at the
5'-end by a phosphoramidate bond, allowing immobilization of more than 1 pmol of DNA
(Rasmussen et al., (1991) Anal. Biochem. 198(1) 138-42).

The use of CovaLink NH strips for covalent binding of DNA molecules at the 5'-end has
been described (Rasmussen et al., (1991)). In this technology, a phosphoramidate bond is employed
(Chu et al., (1983) Nucleic Acids Res. 11(8) 6513-29). This is beneficial as immobilization using
only a single covalent bond is preferred. The phosphoramidate bond joins the DNA to the
CovaLink NH secondary amino groups that are positioned at the end of spacer arms covalently
grafted onto the polystyrene surface through a 2 nm long spacer arm. To link an oligonucleotide to
CovaLink NH via a phosphoramidate bond, the oligonucleotide terminus must have a 5'-end
phosphate group. It is, perhaps, even possible for biotin to be covalently bound to CovaLink and
then streptavidin used to bind the probes.

More specifically, the linkage method includes dissolving DNA in water (7.5 ng/ul) and
denaturing for 10 min. at 95°C and cooling on ice for 10 min. Ice-cold 0.1 M 1-methylimidazole,
pH 7.0 (1-MeIm7), is then added to a final concentration of 10 mM 1-MeIm7. A ss DNA solution is
then dispensed into CovaLink NH strips (75 ul/well) standing on ice.

Carbodiimide 0.2 M 1-ethyl-3-(3-dimethylaminopropyl)-carbodiimide (EDC), dissolved in
10 mM 1-MeIm7, is made fresh and 25 ul added per well. The strips are incubated for 5 hours at
50°C. After incubation the strips are washed using, e.g., Nunc-Immuno Wash; first the wells are
washed 3 times, then they are soaked with washing solution for 5 min., and finally they are washed
3 times (wherein the washing solution is 0.4 N NaOH, 0.25% SDS heated to 50°C).
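
The quantities in the linkage method above imply some per-well figures that are not stated explicitly. The sketch below derives them under two assumptions that are not part of the disclosure: liquid volumes are additive on mixing, and the small dilution of the DNA by the 1-MeIm7 addition is ignored (so the DNA mass is an upper bound). The helper names are hypothetical.

```python
def mass_per_well_ng(dna_ng_per_ul, well_volume_ul):
    """DNA mass dispensed per well, ignoring dilution by the 1-MeIm7 addition."""
    return dna_ng_per_ul * well_volume_ul


def final_concentration(stock_conc, stock_volume, other_volume):
    """Concentration of a reagent after mixing its stock with another liquid.

    Volumes are assumed to be additive; units of volume need only be consistent.
    """
    return stock_conc * stock_volume / (stock_volume + other_volume)


# 75 ul of DNA solution at 7.5 ng/ul gives at most 562.5 ng of DNA per well.
dna_ng = mass_per_well_ng(7.5, 75)

# 25 ul of 0.2 M EDC added to 75 ul in the well gives about 0.05 M EDC in 100 ul.
edc_molar = final_concentration(0.2, 25, 75)
```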

It is contemplated that a further suitable method for use with the present invention is that
described in PCT Patent Application WO 90/03382 (Southern & Maskos), incorporated herein by
reference. This method of preparing an oligonucleotide bound to a support involves attaching a
nucleoside 3'-reagent through the phosphate group by a covalent phosphodiester link to aliphatic
hydroxyl groups carried by the support. The oligonucleotide is then synthesized on the supported
nucleoside and protecting groups removed from the synthetic oligonucleotide chain under standard
conditions that do not cleave the oligonucleotide from the support. Suitable reagents include
nucleoside phosphoramidite and nucleoside hydrogen phosphorate.

An on-chip strategy for the preparation of DNA probe
arrays may be employed. For example, addressable laser-activated photodeprotection may be
employed in the chemical synthesis of oligonucleotides directly on a glass surface, as described by
Fodor et al. (1991) Science 251(4995) 767-73, incorporated herein by reference. Probes may also
be immobilized on nylon supports as described by Van Ness et al. (1991) Nucleic Acids Res.
19(12) 3345-50; or linked to Teflon using the method of Duncan & Cavalier (1988) Anal. Biochem.
169(1) 104-8; all references being specifically incorporated herein.

To link an oligonucleotide to a nylon support, as described by Van Ness et al. (1991),
requires activation of the nylon surface via alkylation and selective activation of the 5'-amine of
oligonucleotides with cyanuric chloride.

One particular way to prepare support bound oligonucleotides is to utilize the
light-generated synthesis described by Pease et al. (1994) PNAS USA 91(11) 5022-6, incorporated
herein by reference. These authors used current photolithographic techniques to generate arrays of
immobilized oligonucleotide probes (DNA chips). These methods, in which light is used to direct
the synthesis of oligonucleotide probes in high-density, miniaturized arrays, utilize photolabile
5'-protected N-acyl-deoxynucleoside phosphoramidites, surface linker chemistry and versatile
combinatorial synthesis strategies. A matrix of 256 spatially defined oligonucleotide probes may be
generated in this manner.

4.21 PREPARATION OF NUCLEIC ACID FRAGMENTS

The nucleic acids may be obtained from any appropriate source, such as cDNAs, genomic
DNA, chromosomal DNA, microdissected chromosome bands, cosmid or YAC inserts, and RNA,
including mRNA without any amplification steps. For example, Sambrook et al. (1989) describe
three protocols for the isolation of high molecular weight DNA from mammalian cells (p.
9.14-9.23).

DNA fragments may be prepared as clones in M13, plasmid or lambda vectors and/or
prepared directly from genomic DNA or cDNA by PCR or other amplification methods. Samples
may be prepared or dispensed in multiwell plates. About 100-1000 ng of DNA samples may be
prepared in 2-500 ml of final volume.

The nucleic acids would then be fragmented by any of the methods known to those of skill
in the art including, for example, using restriction enzymes as described at 9.24-9.28 of Sambrook
et al. (1989), shearing by ultrasound and NaOH treatment.

Low pressure shearing is also appropriate, as described by Schriefer et al. (1990) Nucleic
Acids Res. 18(24) 7455-6, incorporated herein by reference. In this method, DNA samples are
passed through a small French pressure cell at a variety of low to intermediate pressures. A lever
device allows controlled application of low to intermediate pressures to the cell. The results of these
studies indicate that low-pressure shearing is a useful alternative to sonic and enzymatic DNA
fragmentation methods.

One particularly suitable way for fragmenting DNA is contemplated to be that using the two
base recognition endonuclease, CviJI, described by Fitzgerald et al. (1992) Nucleic Acids Res.
20(14) 3753-62. These authors described an approach for the rapid fragmentation and fractionation
of DNA into particular sizes that they contemplated to be suitable for shotgun cloning and
sequencing.

The restriction endonuclease CviJI normally cleaves the recognition sequence PuGCPy
between the G and C to leave blunt ends. Atypical reaction conditions, which alter the specificity of
this enzyme (CviJI**), yield a quasi-random distribution of DNA fragments from the small
molecule pUC19 (2688 base pairs). Fitzgerald et al. (1992) quantitatively evaluated the
randomness of this fragmentation strategy, using a CviJI** digest of pUC19 that was size
fractionated by a rapid gel filtration method and directly ligated, without end repair, to a lacZ minus
M13 cloning vector. Sequence analysis of 76 clones showed that CviJI** restricts PyGCPy and
PuGCPu, in addition to PuGCPy sites, and that new sequence data is accumulated at a rate
consistent with random fragmentation.
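
By way of illustration only, the site specificity described above can be sketched in Python. The sketch assumes a linear DNA sequence, complete digestion, and exact matching of the stated motifs (Pu = A or G, Py = C or T); the function names `cut_positions` and `fragments` are hypothetical and are not taken from Fitzgerald et al.

```python
import re

# Pu = purine (A or G); Py = pyrimidine (C or T).
# Lookaheads are used so that overlapping sites are all reported.
NORMAL_SITE = re.compile(r"(?=[AG]GC[CT])")  # PuGCPy only
RELAXED_SITE = re.compile(r"(?=[AG]GC[CT]|[CT]GC[CT]|[AG]GC[AG])")  # CviJI** sites


def cut_positions(seq, relaxed=False):
    """Return indices i such that the blunt cut falls between seq[i-1] and seq[i].

    The cut lies between the G and the C of the 4-base site, i.e. two bases
    after the start of each match.
    """
    pattern = RELAXED_SITE if relaxed else NORMAL_SITE
    return [m.start() + 2 for m in pattern.finditer(seq.upper())]


def fragments(seq, relaxed=False):
    """Split a linear sequence at every cut position (complete digest assumed)."""
    bounds = [0] + cut_positions(seq, relaxed) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]
```

For example, `fragments("TTAGCCTTCGCTAA")` cuts only at the PuGCPy site AGCC, whereas `fragments("TTAGCCTTCGCTAA", relaxed=True)` also cuts at the PyGCPy site CGCT. A real CviJI** digest is partial and quasi-random, which this sketch does not model.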

As reported in the literature, advantages of this approach compared to sonication and
agarose gel fractionation include: smaller amounts of DNA are required (0.2-0.5 ug instead of 2-5
ug); and fewer steps are involved (no preligation, end repair, chemical extraction, or agarose gel
electrophoresis and elution are needed).

Irrespective of the manner in which the nucleic acid fragments are obtained or prepared, it is
important to denature the DNA to give single stranded pieces available for hybridization. This is
achieved by incubating the DNA solution for 2-5 minutes at 80-90°C. The solution is then cooled
quickly to 2°C to prevent renaturation of the DNA fragments before they are contacted with the
chip. Phosphate groups must also be removed from genomic DNA by methods known in the art.

4.22 PREPARATION OF DNA ARRAYS

Arrays may be prepared by spotting DNA samples on a support such as a nylon membrane.
Spotting may be performed by using arrays of metal pins (the positions of which correspond to an
array of wells in a microtiter plate) to repeatedly transfer about 20 nl of a DNA solution to a
nylon membrane. By offset printing, a density of dots higher than the density of the wells is
achieved. One to 25 dots may be accommodated in 1 mm², depending on the type of label used. By
avoiding spotting in some preselected number of rows and columns, separate subsets (subarrays)
may be formed. Samples in one subarray may be the same genomic segment of DNA (or the same
gene) from different individuals, or may be different, overlapped genomic clones. Each of the
subarrays may represent replica spotting of the same samples. In one example, a selected gene
segment may be amplified from 64 patients. For each patient, the amplified gene segment may be in
one 96-well plate (all 96 wells containing the same sample). A plate for each of the 64 patients is
prepared. By using a 96-pin device, all samples may be spotted on one 8 x 12 cm membrane.
Subarrays may contain 64 samples, one from each patient. Where the 96 subarrays are identical, the
dot span may be 1 mm² and there may be a 1 mm space between subarrays.
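
The geometry of the 64-patient example may be sketched as follows. This is a minimal sketch under stated assumptions, not a description of any actual spotting robot: each subarray is assumed to be an 8 x 8 block of dots on a 1 mm centre-to-centre pitch, patients are assumed to be placed row by row within the block, and the names `dot_position` and `membrane_extent` are hypothetical.

```python
PLATE_ROWS, PLATE_COLS = 8, 12  # a 96-pin device mirrors a 96-well plate
SUB_SIDE = 8                    # assumed 8 x 8 = 64 dots per subarray
DOT_PITCH_MM = 1.0              # assumed centre-to-centre pitch for a 1 mm dot span
GAP_MM = 1.0                    # space between neighbouring subarrays
SUB_PITCH_MM = SUB_SIDE * DOT_PITCH_MM + GAP_MM  # 9 mm between subarray origins


def dot_position(patient, pin_row, pin_col):
    """Return (x_mm, y_mm) of the dot laid down for one patient by one pin.

    patient is 0-63 (one plate per patient); pin_row is 0-7 and pin_col is 0-11.
    """
    if not 0 <= patient < SUB_SIDE * SUB_SIDE:
        raise ValueError("patient index must be in 0..63")
    if not (0 <= pin_row < PLATE_ROWS and 0 <= pin_col < PLATE_COLS):
        raise ValueError("pin coordinates outside the 8 x 12 pin array")
    off_row, off_col = divmod(patient, SUB_SIDE)  # offset inside the subarray
    x = pin_col * SUB_PITCH_MM + off_col * DOT_PITCH_MM
    y = pin_row * SUB_PITCH_MM + off_row * DOT_PITCH_MM
    return x, y


def membrane_extent():
    """Return (width_mm, height_mm) covered by all 96 subarrays."""
    return (PLATE_COLS * SUB_PITCH_MM - GAP_MM, PLATE_ROWS * SUB_PITCH_MM - GAP_MM)
```

Under these assumptions the 9 mm subarray pitch equals the well spacing of a standard 96-well plate, and the 96 subarrays occupy about 107 x 71 mm, which fits on the 8 x 12 cm membrane mentioned above.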

Another approach is to use membranes or plates (available from NUNC, Naperville, Illinois)
which may be partitioned by physical spacers, e.g. a plastic grid molded over the membrane, the grid
being similar to the sort of membrane applied to the bottom of multiwell plates, or hydrophobic
strips. A fixed physical spacer is not preferred for imaging by exposure to flat phosphor-storage
screens or x-ray films.

The present invention is illustrated in the following examples. Upon consideration of the
present disclosure, one of skill in the art will appreciate that many other embodiments and variations
may be made in the scope of the present invention. Accordingly, it is intended that the broader
aspects of the present invention not be limited to the disclosure of the following examples. The
present invention is not to be limited in scope by the exemplified embodiments which are intended
as illustrations of single aspects of the invention, and compositions and methods which are
functionally equivalent are within the scope of the invention. Indeed, numerous modifications and
variations in the practice of the invention are expected to occur to those skilled in the art upon
consideration of the present preferred embodiments. Consequently, the only limitations which
should be placed upon the scope of the invention are those which appear in the appended claims.

All references cited within the body of the instant specification are hereby incorporated by
reference in their entirety.

5.0 EXAMPLES

5.1.1 EXAMPLE 1 

Novel Nucleic Acid Sequences Obtained From Various Libraries
A plurality of novel nucleic acids were obtained from cDNA libraries prepared from various
human tissues and in some cases isolated from a genomic library derived from human chromosome
using standard PCR, SBH sequence signature analysis and Sanger sequencing techniques. The
inserts of the library were amplified with PCR using primers specific for the vector sequences which
flank the inserts. Clones from cDNA libraries were spotted on nylon membrane filters and screened
with oligonucleotide probes (e.g., 7-mers) to obtain signature sequences. The clones were clustered
into groups of similar or identical sequences. Representative clones were selected for sequencing.
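
The idea of grouping clones by probe signature can be illustrated with a short Python sketch. It is a simplification offered under stated assumptions: an exact substring match stands in for a positive hybridization signal, similarity is measured with a Jaccard index, and clustering is a greedy single pass with an arbitrary threshold of 0.8. None of these choices, nor the function names, are taken from the SBH procedure actually used.

```python
def signature(clone_seq, probes):
    """Set of probes scored positive for a clone (exact match stands in for hybridization)."""
    return frozenset(p for p in probes if p in clone_seq)


def jaccard(a, b):
    """Jaccard similarity of two probe sets; two empty sets count as identical."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def cluster(clones, probes, threshold=0.8):
    """Greedy clustering: a clone joins the first cluster whose representative is similar enough."""
    clusters = []  # list of (representative signature, [clone names])
    for name, seq in clones.items():
        sig = signature(seq, probes)
        for rep, members in clusters:
            if jaccard(sig, rep) >= threshold:
                members.append(name)
                break
        else:
            clusters.append((sig, [name]))
    return [members for _, members in clusters]
```

One clone per returned group could then be picked as the representative clone for sequencing.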
In some cases, the 5' sequence of the amplified inserts was then deduced using a typical
Sanger sequencing protocol. PCR products were purified and subjected to fluorescent dye
terminator cycle sequencing. Single pass gel sequencing was done using a 377 Applied Biosystems
(ABI) sequencer to obtain the novel nucleic acid sequences. In some cases RACE (Rapid
Amplification of cDNA Ends) was performed to further extend the sequence in the 5' direction.

5.1.2 EXAMPLE 2
Assemblage of Novel Nucleic Acids 

The contigs or nucleic acids of the present invention, designated as SEQ ID NO: 3573-5358,
were assembled using an EST sequence as a seed. Then a recursive algorithm was used to extend
the seed EST into an extended assemblage, by pulling additional sequences from different databases
(i.e., Hyseq's database containing EST sequences, dbEST version 114, gb pri 114, and UniGene
version 101) that belong to this assemblage. The algorithm terminated when there were no
additional sequences from the above databases that would extend the assemblage. Inclusion of
component sequences into the assemblage was based on a BLASTN hit to the extending assemblage
with BLAST score greater than 300 and percent identity greater than 95%.
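
The extension loop described above may be sketched as follows. This is a minimal sketch, not the assembly software that was used: it assumes the BLASTN hits have already been computed and stored per sequence as `(other_id, score, percent_identity)` tuples, it compares pairs of sequences rather than a growing consensus, and it uses an iterative worklist in place of recursion. The name `extend_assemblage` and the layout of `hits` are hypothetical.

```python
def extend_assemblage(seed_id, hits, min_score=300, min_identity=95.0):
    """Grow an assemblage from a seed EST until no further sequence qualifies.

    hits maps a sequence id to an iterable of (other_id, score, percent_identity).
    A sequence is pulled in only if both thresholds are strictly exceeded.
    """
    assemblage = {seed_id}
    frontier = [seed_id]
    while frontier:  # terminates when no additional sequence extends the assemblage
        current = frontier.pop()
        for other, score, identity in hits.get(current, []):
            if other in assemblage:
                continue
            if score > min_score and identity > min_identity:
                assemblage.add(other)
                frontier.append(other)
    return assemblage
```

In this sketch a hit with a score of exactly 300 or an identity of exactly 95% is rejected, following the "greater than" wording of the text.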

A polypeptide was predicted to be encoded by each of SEQ ID NO: 3573-5358 as set forth
below. The polypeptides were predicted using a software program called FASTY (available from
http://fasta.bioch.virginia.edu), which selects a polypeptide based on a comparison of the translated
novel polynucleotide to known polynucleotides (W.R. Pearson, Methods in Enzymology, 183:63-98
(1990), herein incorporated by reference). The predicted polypeptides are shown in Table 7.

5.2.2 EXAMPLE 3
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank. Other computer programs which may
have been used in the editing process were phredPhrap and Consed (University of Washington) and
ed-ready, ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice
variants, resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 1-327.
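
As an illustration of the kind of check that can precede hand editing, the following Python sketch flags stop codons that occur before the last codon of a reading frame. It is an assumption-laden aid, not the editing procedure itself: it uses only the three standard stop codons, reads a single frame, and does not attempt to detect frame shifts. The function name `internal_stops` is hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code


def internal_stops(cdna, frame=0):
    """Return nucleotide offsets of stop codons located before the final codon."""
    cdna = cdna.upper()
    # Walk the sequence codon by codon in the chosen frame, ignoring a trailing partial codon.
    codons = [(i, cdna[i:i + 3]) for i in range(frame, len(cdna) - 2, 3)]
    return [i for i, codon in codons[:-1] if codon in STOP_CODONS]
```

An offset reported by `internal_stops` marks a position an editor may wish to inspect against the FASTY or BLAST alignment; it does not by itself show that the stop codon is incorrect.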

Table 1 shows the various tissue sources of SEQ ID NO: 1-327.

The nearest neighbor results for SEQ ID NO: 1-327 were obtained by a FASTA version 3
search against Genpept release 117, using the FASTXY algorithm. FASTXY is an improved
version of FASTA alignment which allows in-codon frame shifts. The nearest neighbor result
showed the closest homologue for SEQ ID NO: 1-327 from Genpept. The translated amino acid
sequences which the nucleic acid sequences encode are shown in the Sequence Listing. The
nearest neighbor results for SEQ ID NO: 1-327 are shown in Table 2 below.

Using the eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999), herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998), herein incorporated by reference), all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequences within the sequences that code for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.
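
For orientation, the two summary values can be sketched in Python. The sketch does not call SignalP; it assumes that a list of per-residue S scores and the length of the predicted signal peptide are already available, takes the maximum S score over all supplied positions, and averages the S score over the predicted signal peptide only. The function name `summarize_s_scores` and its inputs are hypothetical.

```python
def summarize_s_scores(s_scores, signal_length):
    """Return (max_S, mean_S) for one polypeptide.

    s_scores holds one S score per residue, starting at the N-terminus;
    signal_length is the number of residues in the predicted signal peptide.
    """
    if not 0 < signal_length <= len(s_scores):
        raise ValueError("signal_length must lie within the scored region")
    max_s = max(s_scores)
    mean_s = sum(s_scores[:signal_length]) / signal_length
    return max_s, mean_s
```

A high mean S score over the predicted signal peptide, together with a high maximum S score, is the kind of evidence summarized per polypeptide in Table 5.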

5.3.2 EXAMPLE 4
Novel Nucleic Acids

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 117, gb pri 117,
UniGene version 117, Genpept release 117). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants,
resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 328-1413.

Table 1 shows the various tissue sources of SEQ ID NO: 328-1413. 

The nearest neighbor results for SEQ ID NO: 328-1413 were obtained by a BLASTP
version 2.0a19MP-WashU search against Genpept release 118, using the BLAST algorithm. The
nearest neighbor result showed the closest homologue for SEQ ID NO: 328-1413 from Genpept.
The translated amino acid sequences which the nucleic acid sequences encode are shown in
the Sequence Listing. The nearest neighbor results for SEQ ID NO: 328-1413 are shown in
Table 2 below.

Using the eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999), herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998), herein incorporated by reference), all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequences within the sequences that code for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.



5.3.2 EXAMPLE 5
Novel Nucleic Acids

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 117, gb pri 117,
UniGene version 117, Genpept release 117). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants,
resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 1414-1652.

Table 1 shows the various tissue sources of SEQ ID NO: 1414-1652.

The nearest neighbor results for SEQ ID NO: 1414-1652 were obtained by a BLASTP
version 2.0a19MP-WashU search against Genpept release 118, using the BLAST algorithm. The
nearest neighbor result showed the closest homologue for SEQ ID NO: 1414-1652 from
Genpept. The translated amino acid sequences which the nucleic acid sequences encode are
shown in the Sequence Listing. The nearest neighbor results for SEQ ID NO: 1414-1652 are
shown in Table 2 below.

Using the eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999), herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998), herein incorporated by reference), all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequences within the sequences that code for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

5.4.2 EXAMPLE 6
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 118, gb pri 118,
UniGene version 118, Genpept release 118). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants,
resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 1653-1745.
Table 1 shows the various tissue sources of SEQ ID NO: 1653-1745. 
The homology results for SEQ ID NO: 1653-1745 were obtained by a BLASTP version
2.0a19MP-WashU search against Genpept release 118, using the BLAST algorithm. The results showed
homologues for SEQ ID NO: 1653-1745 from Genpept. The translated amino acid sequences
which the nucleic acid sequences encode are shown in the Sequence Listing. The homologues
with identifiable functions for SEQ ID NO: 1653-1745 are shown in Table 2 below.

Using the eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999), herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998), herein incorporated by reference), all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequences within the sequences that code for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

5.5.2 EXAMPLE 7
Novel Nucleic Acids

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 119, gb pri 119,
UniGene version 119, Genpept release 119). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants,
resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 1746-1768.
Table 1 shows the various tissue sources of SEQ ID NO: 1746-1768.

The homology results for SEQ ID NO: 1746-1768 were obtained by a BLASTP version
2.0a19MP-WashU search against Genpept release 119, using the BLAST algorithm. The results showed
homologues for SEQ ID NO: 1746-1768 from Genpept. The translated amino acid sequences
which the nucleic acid sequences encode are shown in the Sequence Listing. The homologues
with identifiable functions for SEQ ID NO: 1746-1768 are shown in Table 2 below.

Using the eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999), herein incorporated by reference), all the sequences were examined
to determine whether they had identifiable signature regions. Table 3 shows the signature region
found in the indicated polypeptide sequences, the description of the signature, the eMatrix
p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the PFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998), herein incorporated by reference), all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the PFam score for the identified domain
within the sequence.

The nucleotide sequences within the sequences that code for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process for
identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also disclosed by
Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the publication
"Identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites"
Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by reference. A maximum
S score and a mean S score, as described in the Nielsen et al. reference, were obtained for the
polypeptide sequences. Table 5 shows the position of the signal peptide in each of the polypeptides
and the maximum score and mean score associated with that signal peptide.

5.6.2 EXAMPLE 8

Novel Nucleic Acids

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 120, gb pri 120,
UniGene version 120, Genpept release 120). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The translated amino acid sequences which the nucleic acid
sequences encode are shown in the Sequence Listing. The full-length nucleotide sequences, including
splice variants, resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS:
1769-1786.

Table 1 shows the various tissue sources of SEQ ID NO: 1769-1786.

The homology results for SEQ ID NO: 1769-1786 were obtained by a BLASTP version
2.0a19MP-WashU search against Genpept release 120 and the amino acid version of Geneseq
released on October 26, 2000, using the BLAST algorithm. The results showed homologues for
SEQ ID NO: 1769-1786 from Genpept. The homologues with identifiable functions for SEQ ID
NO: 1769-1786 are shown in Table 2 below.

Using the eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999), herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998), herein incorporated by reference), all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequences within the sequences that code for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

Table 6 is a correlation table of all of the sequences and the SEQ ID NOS.

TABLE 1

Tissue Origin | RNA Source | Hyseq Library Name | SEQ ID NOS:

adult brain | GIBCO | AB3001 |
9 19-21 50-51 65-66 72 78 80 82
85 87 107-108 113 116 123 138
140 150-152 159 169 177 192-193
202-203 212 214 225-226 235-236
251 258 268 269 272 280-281 295
298 301 321 326 331-332 334 356-
357 362 369 379 382-383 416 423
443 459-460 473 475 477 488 496
500 503 519 526 547 574 582 587
608-609 613 618 633-634 645-646
652 657-658 660 669-671 678 687
695 697 710 715 724 731 775-777
796 804 811 857-859 862 869 899-
900 912 919 922 924-929 933 936
962 979 988-989 996 1001 1004-
1008 1018 1039 1047 1059 1064
1067 1070 1078 1082 1107 1113
1116-1117 1131 1134-1137 1140
1149 1151 1157 1180 1206 1229
1234 1241 1243 1258 1272-1273
1279 1288-1290 1294 1307-1308
1312 1320 1323 1330 1356 1360-
1361 1368 1373-1375 1379 1391
1400 1417 1446 1468 1482 1493-
1494 1501-1503 1506-1507 1512
1517 1522-1524 1530-1533 1537
1549 1565 1578 1598 1606 1608
1623 1625 1627 1639 1643 1648-
1649 1653 1664 1667 1671 1696
1734 1741 1743-1744 1760-1761
1771

adult brain | GIBCO | ABD003 |
3 12-14 18-19 25 30-31 34-36 43-
45 50-51 56 58 60 65-66 68-69 80
82 85 87 92 104 107-108 112-113
115-116 123-124 131-132 135-137
139 142 146 148-149 152 154 157
159 163 165 167 169 172 180 192-
193 196-197 199 203 208 210 212-
214 223 233 235-237 247 257 259
261 268-269 272 276 280-281 284-
288 291-292 295 297 300-301 304
307 317 320-321 323 327 329-331
333-334 345-349 356-357 379-381
393 401 408 414 419 424 426-428
430 433-436 438-439 443 445 449
453-454 459-461 468 471-473 476-
478 483 491 494 496 500 503 507-
508 516 519-520 525-527 534 536-
540 542-543 545 553 555 560 569-
570 574-576 586-588 593 595 597
601 606-609 61S-620 622-623 625
628-633 635-636 643 645-649 653
655-656 660-665 668-670 676 681
687 701 710 715 717 724-728 735
743 745-746 750 753 759 765-766
773 775-778 786 789 796 799-800
802-803 810-811 815 817 820-821
832 834-836 840 845-847 851 858-
861 864 869 874 878 883 897 901-
902 904-905 908 911-914 916 921-
922 924-927 929 932-934 936-939
941-942 945 955-958 963 966-969
977 979-980 985-986 990 992-993
997-1001 1005-1007 1012 1017-
1020 1023-1024 1029-1031 1034
1036 1039 1050 1059 1063-1066
1078 1081-1082 1085-1086 1089
1097 1103 1107 1109 1112 1116-
1117 1119 1121 1124 1127 1130
1134 1144-1145 1149 1151 1157-
1158 1167 1170 1178 1184 1188
1190 1193-1194 1200 1202 1215-
1217 1220 1226-1227 1229 1231
1241 1243 1247 1252 1258 1263
1267 1269 1279 1281 1284 1286-
1289 1293-1294 1306-1307 1312
1316-1320 1326 1333 1338 1341
1344 1348 1351 1355-1357 1368
1374 1377 1380 1386 1389-1390
1394 1400 1409 1414 1422-1423
1425-1427 1437 1443 1446 1454
1456 1458-1459 1468 1470-1472
1478 1482-1483 1487-1488 1493
1497 1499 1506 1508-1511 1517
1522-1524 1530-1533 1S4S-154S
1548-1550 1552 1557-1559 1563
1565 1567 1569 1571 1586 1588
1591 1593 1595 1598-1601 1608
1611 1620-1621 1624-1626 1628
1630-1632 1636 1640-1641 1644-
1645 1647 1649 1653-1655 1657
1664 1667 1669 1673 1678-1681
1686 1690 1694-1696 1701 1709
1711 1719 1722-1723 1726-1727
1731-1733 1738 1740 1743-1744
1747 1749 1753 1757-1758 1760-
1761 1765 1771 1785

adult brain | Clontech | ABR001 |
9 29 68-69 113 115 146 152 206
223 245 277 307 320 324 330-331
344 348 352 362 379 384 393 404
408 414 441-442 454 46^ 481 490
506 517 586 597 631 641 659 691
715 799 803 833 865 871 875 880
882 908 920 937 1000 1005-1006
1027 1036 1041 1043 1075 1107
1112 1121 1127 1136-1137 1144-
1147 1231 1238-1239 1280 1293
1320 1345 1355 1361 1383-1384
1400 1417 1448 1456 1476 1507
1570 1572 1603-1610 1614 1620
1626 1645 1653 1754 175S 1770
1786

adult brain | Clontech | [library name not legible] |
5-8 15-16 168 212-213 271 278
280-281 291-292 300-301 310 314
321 326 336-338 341 352 357 359-
360 362 369 374 379 384 393 396-
397 414 419-420 426-428 430 441-
442 453 506 616-617 661 689 785
798 845 1018 1109 1113 1124 1148
1167 1187 1207 1227 1252 1265
X28S 1312 1317-1319 1324-1327
1344 1369 1381 1400 1416 1421
1427 1430-1431 1436 1471 1501
1557-1559 1586 1588 1651 1653
1664-1665 1671 1673 1690 1697-
1698 1700 1711 1717 1719-1720
1728 1736 1740 1743-1744 1757
1760-1761

adult brain | Clontech | ABR008 |
5-10 13-19 22-23 25 29 33 37-39
43-45 50-51 54-55 57-58 60-66
68-70 72 75 77-80 83 85 89-92 94
99-105 108-110 112-113 116-117
123 128 133 135-137 139 143 145-
146 148 152 154-155 157 166 168-
172 174-17S 181-184 188-190 193-
194 196 198-200 202 204-205 207- 
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Tissue Origin 



RNA Source 



Hyseq 
Library Name 



S2Q ID NOS: 



208 210 214-215*218 221-226 2Ts~ 
231-232 234-241 245-247 251-253 
2SS 2S7-259 268-269 271 276-281 
285*286 288 290-292 300-302 304 
307 309-311 313 315 317-318 320- 
322 32S-326 328 330-331 333-338 
341 344-347 349 352 3S4 356-357 
362 365-373 376 379-380 382 384 
387 390-3S1 393-394 397 399-403 
405-411 414-415 417-420 426-428 
437-438 440-444 4S3-43S 462 464 
467 469-471 476 478 432-484 488- 
491 497 503 S06-S13 516-517 520 
524-526 528-530 532-534 S37-540 
542 544 547-SSl 553 561 565-567 
572-574 577 531 585 537-588 590- 
591 597 599 601-602 606-610 612 
615-617 619-620 622-623 628-629
631 633-634 636-641 643 645-647 
651-653 655-664 669-671 673 679 
682 687 689 691-700 702 706 710 
715-717 720-721 725-734 736-739
742-743 746 750-752 756 758-759 
762-764 766 768 773-778 780-782 
784-785 787-789 794 796 799 802- 
803 805 811 814-815 818 825-826 
834-837 839-840 842-843 856-859 
861-862 865 867-872 874-875 881
883-884 887 889-892 894-895 897-
898 901 904 908 910 912 914 917 
919 921-924 926-927 930-932 935- 
941 943 945 949 953-954 958 961-
963 967 969 971 975 977 981-983 
986 988-990 992 997 999-1002 
1004-1006 1008 1012 1018-1023 
1027 1029-1031 1035-1037 1047- 
1048 1053 1057 1059 1063 1068
1070 1072-1075 1077 1081-1083 
lOaS-1093 1095-1096 1108-1112 
1114-1125 1127 1131-1133 1135- 
1138 1142-1145 1148-1158 1160- 
1163 1167 1169 1172 1175 1177
1180 1183-1188 1191-1195 1199- 
1200 1204 1206 1211 1213-1216 
1222-1223 1226-1227 1229-1231 
1234-1235 1241-1242 1244-1263 
1266 1269-1271 1276-1277 1279- 
1281 1284-1286 1292 1294-1295 
1299 1305-1309 1312 1314 1316- 
1319 1322 1324-1327 1330 1332 
1334-1335 1339 1344-1346 1351 
1354-1355 1357-1358 1365-1367 
1368-1370 1373-1374 1376-1379
1381-1384 1386-1388 1392 1394
1396-1397 1400 1403-1407 1410
1414 1419-1420 1423 1432-1433 
1435 1437-1438 1440-1442 1446
1448 1453-1455 1457 1461 1463-
1464 1466 1468 1471 1477 1480 
1482-1483 1496 1502-1504 1507- 
1509 1513 1519-1520 1524-1526 
1536 1547 1549-1552 1567 1573- 
1574 1578 1586-1589 1597-1598
1601-1602 1605 1607-1609 1611- 
1617 1619-1621 1623 1625-1626 
1635-1641 1643-1645 1649 1651 
1653 1656-1658 1664 1669 1671-
1674 1676-1684 1686 1689-1690
1694-1696 1704-1705 1708-1709
1720-1724 1726-1728 1730-1733 
1737-1740 1742-1745 1753 1755-
1757 1759-1761 1765 1767 1771-
1772 1776-1777 1779-1780 1786 



adult brain 



ABR011



24 75 103 186 210 310-312 364-
365 508 623 710 937 1002-1003
1059 1204 1609 1731-1732



adult brain 



BioChain



46 182-184 204-205 300 739 767 
1371 1549 1620 1684 



185 204-205 364-365 393 497 595 
687 692-694 830 845 1068 1320 
1413 1640 



adult brain 



Invitrogen 



adult brain 



Invitrogen



187 301 357 364-365 375 454 463 
731 859 939 983 1073 1262 1270 
1320 1403 1640 1651 1657 1696
1722 1738 



adult brain 



Invitrogen



ABR015 



419 434-435 441-442 763 789 983
1320 



adult brain 



Invitrogen



312 364-365 379 1320 1334-1335 
1674 1722 1785 



adult brain



Invitrogen



14-16 22-23 25 37-39 43 58 60
70-72 78 86 94 107 113 116 136-
137 143 146 152 161 173 182-184
194 196 198 210 218 229 259 267
295 298 309-310 320-321 324 336-
338 346-347 349-350 356-357 362
371 379-380 382-383 391 393 396
399 401 408 428 438 459 461 476
482 490 502 507-509 516 526 531
557 562 597 602 607 609 624 652
655 667 669 671-672 687-689 695-
696 710 712 715 721 732 739 743
750 753 766 776 780 781 789 803
814 826 830 837 841 857 869 874
894-895 925 937 949 954-956 960-
961 963 968-969 988 989 1000
1005-1006 1016-1019 1021 1036-
1037 1052 1086 1090 1109 1113
1115 1120-1121 1123-1124 1136-
1137 1140 1144-1147 1151 1167
1170 1174 1188 1193 1194 1205
1225 1229 1231 1254 1258 1262
1280 1285 1309 1312 1334-1335
1341 1343-1344 1356 1357 1370
1378-1379 1383-1384 1403-1404
1423 1429 1434 1442 1448 1451-
1452 1454 1470-1472 1482 1499
1525 1528-1529 1532 1536 1547
1554 1557-1559 1561 1562 1567
1585 1588 1590 1595 1601-1604
1608 1610-1613 1615 1619 1624
1627 1640 1644 1647 1660 1664
1666 1670 1675 1696 1704 1715
1723 1727 1738 1760-1761 1768
1779 1785-1786



cultured 
preadipocytes 



Stratagene



5-8 11 17 25 68-69 80 82 87 103
105 110 116 136-138 168 171 188-
189 196-198 261 267 276 288 293
301 318 331 336-338 379-380 391
400 425 430-431 510 512 520 524
527 549 557 561 602 618 620 622
631 637 647 670 681-682 710 731
748 782 793-794 817 834-836 843
845 858-859 879 882 893-895 934
960 982 986 995-996 1000 1002
1005-1007 1025 1027 1028 1032
1039 1045 1071 1078 1097 1095-
1102 1136-1137 1140 1219-1220
1260 1271 1297-1298 1314 1320
1322 1329 1339 1345 1365-1366
1370-1371 1398 1408 1423 1431
1437 1466 14GS 1533 1539 1594
1602 1608 1614 1631 1649-1650
1660 1652 1673 1687-1688 1696
1711 1719-1720 1742 1746 1749
1760-1761 1765 1767 1771 1785


adrenal gland


Clontech


ADR002


4-10 15-16 25 29-31 43-45 47 50-
51 55 60 62-63 65-66 75 80 102
116 118 122 126 130 137 150 169-
170 181 192 196 201-203 215 227-
228 247 251 255 267-269 271 280-
7ft1 9ft^ "^QR "Jcifl 311 336-338 342
349 351-352 35< 372-373 383-385
391 400 410 415-416 424 426-427
431 434-437 43£ 445 454 461 473
477 483 491 493 497-498 503 516
519 527 535 546 549 552 572-573
581 588 595 600 602 608-610 620
628-630 637 645-646 670 679 703
713 715 719 732 734 744-746 758
773-778 789 816 829 837 845 848
869 875 883 B9£ 904 912 922-923
930-931 942 948 952 955 967 969
976-977 981 990 992-993 1001
1004 1049 1055 1059 1071-1072
1076 1112-1113 1115 1121 1127
1134-1135 1151 1158 1163 1175
1181 1188 1209 1218 1224-1225
1227 1231 1243 1270-1271 1274
1280 1285 1290 1293 1307 1324-
1325 1327 1330 1342-1343 1345
1348 1365-1366 1369 1378-1379
1387 1398 1400 1405 1417 1425-
1426 1436 1440-1441 144i 1454
1463-1464 1488 1491 1507 1512
1538 1546 1567 1573-1575 1588
1598 1609 1614 1618 1622 1624
1627 1PT4 1636 1649 1651 1658
1671 1674 1678-1679 1691-1692
1703 1717 1727 1731-1732 1737
1765








adult heart


GIBCO


AHR001


4-8 10-11 15-16 18-21 34-39 44-
46 50-52 57-58 60 62-63 71 75 82
85 87 89 94 97 100 103-104 108-
110 112 114 116 118-119 122-123
127 130-132 134 136-138 141-144
147-151 153 163-164 168-171 179
186 192 195 197 199 204-205 212-
215 220 225-226 229-230 232 234-
236 251 257-260 262 265 272 274
277 280-282 285-286 289-292 296
298-301 304 307 309 314 321 324-
325 330 333 336-338 345 349 351-
352 354 358 361 368 370 360 383-
384 387-398 391 393 397 401 406
408-409 411-412 414-416 430-431
433-439 445-446 449 452 454-455
457 459 462 469 472-473 476-480
483-484 487-490 492-493 496-498
503 506 508 510-513 516 519-522
526 534 536-540 542 546 549 553
560-562 574-577 581-582 584 586-
587 589 593 595 597 604-609 611-
612 615-620 622-623 626 632 637
645-652 656-660 665-666 670-672
674-675 683-684 687 692-694 697
701 709 712 715-716 719-720 725-
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Tissue Origin 



adult kidney 



RNA Source



Hyseq
Library Name 



GIBCO 



AKD001



SEQ ID NOS: 



726-728 730-732 7Jb 738-739 743-
744 746 751 753 759 761 765 770-
771 775-780 785 788-790 796 802
804 810 812 817 821 826 828 830 
837 843 845-847 849-853 857-861 
863-864 869 871 875 877-879 881 
883 887 890-892 894-895 897-898
901 903 906-907 911-913 915 919 
921-925 927-928 933-935 945 958 
961-963 967 969-972 975 977-978 
980-986 990 992 999-1002 1005- 
1007 1010 1016 1019-1020 1022-
1023 1025 1028-1037 1039-1040 
1043 1047 1050 1054-1055 1057
1059 1063-1064 1067-1068 1070 
1072 1075-1076 1083 1085-1087
1089 1093-1094 1104 1106 1108- 
1109 1113 1116-1117 1119 1121 
1124 1126 1128 1131-1134 1144- 
1145 1148-1149 1151 1156 1167 
1169-1170 1175 1177 1192 1196 
1199-1200 1202 1206-1208 1211
1216 1218 1222 1227-1229 1232- 
1235 1238-1241 1243-1244 1247- 
1248 1250 1253-1254 1256-1258 
1261 1268 1270-1271 1277 1280- 
1282 1287 1292 1298-1299 1306 
1308 1317-1321 1324-1325 1330 
1332 1334-1337 1339 1344-1345 
1349-1350 1354-1356 1359-1360 
1365-1366 1369 1371 1374-1375 
1378-1380 1383-1384 1389 1397 
1400 1403 1409 1417 1423-1426 
1437 1439 1442 1444 1446-1447 
1450 1453 1468 1470 1473 1479 
1481 1488 1490 1501-1504 1519 
1521 1524 1528 1530-1534 1536-
1537 1539 1541-1542 1547 1553
1555 1560 1565 1567-1571 1588
1591 1597-1598 1601-1602 1605
1614-1616 1619-1620 1623-1628
1630-1632 1634 1636 1641 1644- 
1645 1647 1649 1652-1655 1659
1662 1667 1673-1674 1680-1681 
1684 1686-1688 1704-1705 1709 
1711-1712 1717 1724 1726-1727 
1731-1733 1737-1738 1741 1743- 
1744 1749 1754-1755 1760-1761 
1765 1772 1785



4-8 10-11 17-21 29-31 35-39 42-
45 50-51 56-56 60-61 64 68-69 75 
77 80 82 85 87 92-94 97 100 102-
104 107-108 112 116-117 119 123
127-133 136-137 139-141 143-144 
147-154 157 161-163 165-166 169
172 176 178-179 192 194-197 199
201 203-206 209-210 212-213 215- 
216 223-228 234-236 238 247 251- 
253 257-259 261-262 265-269 271- 
272 274 276-277 279-281 284-286
290 293 295 298-299 301-302 304
307 311-313 321 325-326 329-331 
333 341 344 348-350 352 356 358-
359 362 364-365 368 370-372 374 
376-377 380-382 392 395 398 400-
401 404 407-409 414-415 423-424 
430-437 443-444 446 449 451 453- 
455 459 461-462 464 467 469 471- 
474 476-477 480-481 483 487-488 
490-491 493 497-505 510-513 516-
520 522 524 526-529 534 537-540
544 547 549 554-556 560 562 564
567 571-576 578 582 586-589 592-
593 598-599 601 604-606 608-613
615-619 621-626 632-634 637-643
645-652 655 660-664 669-672 676
678-679 688 692-695 698 702 711
713 717 719-720 727 731 735-736
738 743 745-746 751 753 755 762-
763 765 771-773 775-778 780 786
788 793 795-796 800 803 805 808
810-812 814-819 821 826 829 832
834-838 842-845 848-855 857-861
864-865 867 869 871 874 876-883
886-887 889-891 893-896 898-900
902 906-908 910-914 918 920 922
925-927 929-935 937 940-942 945
948-949 951 953-958 960-961 963-
964 969-970 972 976-978 982-986
988-990 992-993 995-997 999-1002
1004-1008 1010 1012 1013 1016-
1017 1019-1020 1022 1025-1031
1035 1038-1040 1042 1044 1047
1050 1054-1055 1057-1064 1068
1070-1073 1078 1085-1086 1088-
1089 1092 1094 1097 1099-1102
1107 1109-1112 1116-1119 1121
1123-1125 1132-1135 1140 1142-
1143 1146-1147 1149-1150 1153-
1154 1157 1159 1163 1167 1170
1178-1179 1181 1183 1192 1196-
1200 1202-1204 1206-1211 1216-
1219 1221-1222 1225 1227-1230
1232-1234 1238-1241 1243-1244
1246-1247 1253 1257-1258 1260-
1261 1267-1268 1270 1272-1274
1281 1283 1287-1289 1293-1295
1299 1306 1308 1311-1313 1317-
1320 1323 1329-1330 1334-1335
1339 1341 1349-1350 1353-1357
1359 1367 1369 1373 1375 1378-
1379 1394 1397 1400 1403 1405
1407-1409 1417 1419 1423-1424
1428-1431 1433 1437-1438 1442-
1443 1445-1446 1448-1450 1453-
1454 1459 1461 1465-1468 1474-
1475 1478 1484-1486 1490 1492-
1493 1495 1497-1498 1506-1507
1509 1512 1518 1521-1522 1525
1527-1528 1532-1533 1537 1540-
1541 1547-1550 1552 1556-1559
1561 1565-1566 1568 1571 1575
1578-1579 1583 1586-1587 1589
1591-1592 1594 1598 1600 1603-
1604 1606 1608 1611 1613 1615-
1616 1618-1622 1624-1628 1631-
1632 1634-1636 1638-1639 1641
1644 1646-1649 1653-1656 1662
1664 1666-1667 1670-1671 1676-
1679 1683-1684 1686 1691-1692
1696-1699 1701 1709-1711 1713-
1714 1716-1719 1723-1724 1726-
1727 1733 1737-1738 1741 1743-
1744 1748-1749 1751 1760-1761
1763-1768 1776 1780 1785



adult kidney 



Invitrogen 



AKT002 



20-21 37-39 47 52 57 60 65-66
68-69 80 104 107-108 122 130 133 
136-137 140 142-143 149 169 174
WO 01/53312



PCT/US00/34263



181 197 227-228 235-236 244 251
261-265 267 280-281 286 290 299
301 304-305 309 312-313 339 341
344-345 349 358 370-372 376 382-
383 387 392 401 414 416 421 430
443 445 449 453-454 472 487-488
504 506 513 516 519 522 528 536-
540 546 554 585 587 594 598 602
607 616-617 626-627 636 643 662-
664 695 709 721 735 743 751 768
775-777 788 796 804 814 827 837-
838 849-850 852-853 869-870 881
890-892 898 903 905-907 914 919
925 927 934 941 949 952 957 960
962 968 970 1000 1000 1029-1030
1044 1052 1055 1063 1067-1068
1085 1099-1102 1107 1110-
1111 1113 1115 1119 1126 1134
1136-1137 1146-1148 1153 1159
1192 1196 1199 1232-1233 1241
1256 1264 1272-1273 1281 12ss
1293-1294 1299 1312 1320 1324-
1325 1330 1344 1349 1351 1355-
1356 1365 1378-1379 1403 1414
1419 1428-1429 1436 1446 1458
1463-1464 1467-1468 1470 1477-
1478 1486 1491 1509 1519 1527
1529 1534 1547 1596 1600 1619
1623 1629 1631 1634 1638 1643
1647 1652 1660 1664 1667 1669-
1670 1673 1685 1709 1727 1740
1776


















.^VLwU KJ X 


4-8 14 37-39 44-46 50-51 56 62-
63 75 82 88 93 103-104 113 125
133 140 143 150 152 154 157 162
171-172 174-175 190-191 196 200
211 214 219 223-224 227-228 251-
252 256 265 272 274 280-281 285
310 332 345 351 362 371 381-382
394 408-409 431 436 445 454 459
461 467 469 471 476-477 488 504
513 527 537-540 544 547-548 554
564 583 607 616-617 621 623-624
634 645-646 662-664 670 695 716
719 743-744 763 766 774 789 803
811 814 817 831-832 837-838 845
852-853 858-859 861 866 880 887
901 905 941 954-957 966 971 977
979 981 987 990 992 996 1001
1005-1006 1014 1017 1045 1047
1054 1059 1062 1064 1072 1080
1086-1089 1094 1107 1126 1134
1136-1137 1142 1150 1157 1173
1190 1200 1208 1220 1241 1272-
1273 1280 1282 1295 1306 1320
1331-1332 1353 1374 1379 1383-
1384 1404 1409 1423 1434 1436
1442 1474 1478 1494 1509 1522
1525 1531-1532 1547 1549 1553-
1554 1571 1598 1606 1613 1624
1627-1629 1632 1642 1644 1662
1669 1676-1677 1684 1696 1727
1731 1732 1737-1738 1748-1749
1786














lymph node 


Clontech 


ALN001


4 24 50-51 82 105 137 153 198
201 223-224 234 268-269 272 280-
281 287 301 312 329 343 382 421
430 433 445 451 461-462 475 481-
482 503 526 529 537-540 546-547



Tissue Origin 



young liver 



adult liver 



RNA Source 



GIBCO 



Hyseq 
Library Name 



SEQ ID NOS: 



ALV001



Invitrogen



ALV002 



621 626 649 679 719 725-726 738
793 803 831 834-836 838 844 857-
858 866 879 905 913 928 963 976
1005-1006 1012 1038 1050 1116-
1117 1151 1199 1204 1226 1243
1265 1274 1324-1325 1339 1353 1374 1377
1440-1441 1447 1504 1549 1600
1618-1619 1631 1641 1644 1653
1687-1688 1691-1692 1741 1771



5-8 11 20-21 46 50-51 58 65-66
75 79 82 93 97 102-103 108 110
116 139 143-144 148-149 171-172
174 187-189 194-195 198 209 214
215 230 250 258 267-269 280-281
306 309 342 351 356 359 362 372
374 392 394 398 401 407-408 410
414 431 444 455 459 476 470 483
493 510-512 516 520 522 526 536
549 571 574-577 585 592 601-602
607 621-624 628-630 632-633 637
648 660 666-667 678 697-698 700
717 719 728 730 734 738 744-745
766 770 773 779 788 800 805 812
814 841 849-851 871 874 879 887
893 898-900 902-904 906-907 911
919 922 924 934 953 957 963 965
970 984 986 997 1001 1004 1007
1012 1029-1030 1033-1034 1052
1061 1066 1070 1076 1086 1089
1093 1099-1102 1110-1112 1116-
1117 1119 1121 1125 1136-1137
1144-1145 1156-1157 1159 1196
1199-1200 1209 1211 1219-1220
1241 1244 1262 1270 1275 1279
1283 1295 1317-1320 1332 1339
1344 1359 1362-1363 1379 1383-
1384 1403 1415 1430-1431 1437
1450 1467 1475-1476 1483-1484
1494-1495 1498 1505 1512 1516
1518-1519 1526 1529 1547 1550-
1552 1557-1559 1565 1583 1587
1597 1609 1614 1620 1631 1637
1641 1644 1654-1655 1662 1667
1669 1684 1691-1692 1702 1711
1725 1738 1741 1743-1744 1758
1760-1761 1763-1765 1769



5-8 17 20-21 32-33 41 55 58 64
75 77 86 89 102 108 117 119 175- 
176 198 200 209 231 235-236 250 
272 275-276 284 306 31$ 321 325 
333 356 359 374 376 398 401 408 
414 428 430 433-435 454 476 494 
503-505 517-518 528 534 544 552
561-563 567 578 581 608-609 630 
632 637 644 650 661 665 672 702 
707 710 721-722 750 753 778 782 
794 814 820 826 834-837 847 849- 
850 858 861 874 879 893 898 904
911 918 921-922 926 946 948 972 
978 986 996 1020 1027 1031 1034 
1053 1063 1068 1070 1073 1086 
1089 1093 1097 1113 1119 1156 
1159 1195 1198-1199 1208 1220 
1227 1241 1261 1272-1273 1277
1285 1308 1315 1320 1324-1325 
1330 1362-1363 1375 1403 1408- 
1409 1415 1431-1432 1435 1467 
1469 1482 1504 1524 1542 1547 



Tissue Origin 



adult liver 
adult ovary 



RNA Source 



Clontech 
Invitrogen



Hyseq 
Library Name 



ALV003



SEQ ID NOS: 



1550 1567 1578 1581 1583 1594
1597 1601-1602 1611-1612 1615
1618-1619 1621 1625 1637 1645
1647 1652 1654-1655 1660 1666
1669-1671 1684 1706 1722 1737-
1738 1742-1744 1760-1761 1763-
1765 1772 1774



29 676 997 1063 1119 1536 1766



1 4-18 20-23 29 35-40 42-48 50-
51 53-58 61-63 65-66 68-69 73-75
77-78 80 82 85 87 89 97 100-101
103-104 106-108 110 113 115 118 
122-124 126 128 133-134 136-140 
142 145-147 149-157 161 166 168- 
170 174 177-178 180 182-186 188-
189 192-203 207 209 211-215 219 
221-224 229-230 234 242-243 246- 
247 255 258 260-262 265-269 271-
272 274 277-281 284-286 288 290 
295 299 301-302 304 307 309-311 
313-314 316 321 323-326 330 332- 
333 335-338 341 344 349 352-353 
356 358 360 362 370-372 376-377 
379-384 387 390-392 394 397-398 
400-403 408-410 412 414-416 423-
424 426-427 430-435 439 443-446 
448-449 451 453-455 462-463 468- 
471 473 476-479 481-484 487 489- 
494 496-497 499-501 503-505 509-
514 516-517 519-520 522 524 526 
528-534 541-544 546-547 549 552
554-555 561-564 566-567 569-570
572-573 575-576 579 581 583 585-
588 590-591 593 595 597 599 601- 
605 607-613 €1B 618-622 624-627 
630 632-633 636-640 642 644-647 
649-652 654-655 657-665 667-675
677-678 681 683-684 692-695 697-
710 714-721 723 725-727 729 732 
734-735 743-746 750-751 753 758 
763 765 767 772-773 775-778 780 
783-784 786 788 790-791 794-796 
800 803 805 809-811 813-815 818-
819 821-824 826 828-829 831-832 
837-838 843-850 852-857 859-864
867 869 871-872 874-875 878-883 
887-888 890-895 898-910 912-914 
916 919-922 924 926-927 929-939 
941 943-946 948-951 953 955-958
961-964 966-967 970-979 981-982 
985-986 988-990 992 995-997 999- 
1001 1004-1009 1011-1013 1016 
1019-1020 1024-1025 1029-1031 
1033-1035 1037 1039 1041-1047 
1050-1051 1054-1060 1062-1064 
1067-1070 1072-1073 1075-1076
1078-1079 1085-1086 1089-1090 
1094-1096 1098-1103 1106-1108
1112-1117 1119-1120 1123-1127 
1131-1135 1142-1143 1146-1149 
1153 1156 1158 1163 1165-1166
1169-1171 1173-1175 1177-1178 
1180 1183-1185 1190-1191 1195
1197-1200 1202 1205-1214 1217-
1219 1221-1226 1232-1235 1238- 
1241 1243-1244 1247 1249 1252- 
1254 1256-1258 1262 1265 1267- 
1268 1270 1275 1278 1280-1283
1286-1289 1291 1293-1294 1298- 
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Tissue Origin 



adult placenta



RNA Source 



Clontech 



Hyseq 
Library Name 



SEQ ID NOS: 



1299 1306 1308 1312 1317-1321
1323 1327 1329-1330 1332-1333
1338- 1335 1341 1343- 1351 1356
1359 1361 1365-1366 1371-1375
1377-1379 1383-1384 1386 1389
1394 1400 1404 1416-1417 1422-
1427 142S-1431 1435-1436 1439-
1443 1445-1450 1453-1454 1459
1463-1464 1466 1468 1470 1474-
1481 1484-1485 1488 1491 1493-
1494 1496-1498 1501-1504 1506-
1507 1511-1517 1519 1521-1524
1526-1527 1530-1531 1534-1536
1538-1539 1541 1546 1548-1550
1553 1555-1559 1561-1563 1566-
1567 1569-1570 1572 1574-1575
1578 1580-1581 1587-1588 1590-
1591 1595 1597-1598 1600-1606
1609 1611-1621 1623-1630 1634
1636 1638 1641 1643 1645 1647-
1657 1659 1662 1664 1667 1669-
1671 1673-1674 1676-1681 1683-
1690 1639 1702-1707 1710-1711
1713-1714 1716-1719 1723-1724
1726-1728 1731-1733 1735 1737-
1738 1740-1741 1743-1744 1748-
1751 1753 1755-1756 1760-1762
1765 1767-1768 1770-1771 1776
1778-1779 1783-1784 1786



5-8 44-45 90-91 107-108 159 178 
311 351 414 476 503 545 574 624 
636 719 755 773 860 890-891 924
947 955-956 962 990 992 1002
1045 1202 1320 1369 1628 1686 
1713-1714 1743-1744 



14-16 26 29 43 60-61 79-80 103
106 116 135 171 177 180 194 196 
198 210 216 235-236 272 290 299 
309 329 334 339 359 379-380 417 
423 430 434-435 448 454 483 490-
491 517 522 631 723 725-726 728 
738 746 769 818 843 854-855 857- 
858 916 948 953-954 976 988-989
1005-1006 1013 1033 1036 1064 
1068 1070 1086 1139 1144-1145 
1160 1277 1285 1317-1320 1343 
1345 1429 1435 1438 1454 1482 
1486 1490 1512 1519 1532 1549 
1592-1593 1602 1626 1647 1649 
1664 1673 1675 1722 1727 1730
1746 1776 



placenta 



Invitrogen 



adult spleen 



GIBCO 



ASP001



3 5-8 12 15-16 19-21 24 29 34-36
44-45 57 60 82-83 87 89 94 98-99
103 106 108 117 119-121 139 141
147 152-153 155 166 169 171 174
178-180 196 198 201-206 209-211
215 219 234 253-254 256 258 264
272 280-281 290 295 302 309 312
325 333 341 349 358 372 382 386-
387 394 406 414 431 434-436 446
448 451 473 481 490-493 500 503
505 517 519 530 534 536-540 547
554 557 574-576 582 592 595 604
611-612 620-621 623 631-632 642
652 659 661 667 671 673-675 684
700 721 728 730 732 738 742-744
746 762 765 774 780 788-789 794
810-811 817 822 830 832 845 848
852-853 858 862 866 874 879 882



Tissue Origin 



Genomic DNA 
from BAC 63118



Genomic DNA
from BAC 35316 



RNA Source 



GIBCO



Research 
Genetics 
(CITB BAC 
Library) 



Research 
Genetics 
(CITB BAC 
Library) 



Hyseq 
Library Name 



ATS001



BAC001



BAC002 



116 



SEQ ID NOS: 



84 906-908 912 919 921-923 926-
927 934 942 949 957-958 963 977-
978 983 990 992-994 996-997 998
1005-1007 1010 1012 1031 1036 1042-1044
1046 1049 1059 1065 1070 1076
108£)-1090 1094 1103 1109 1112
1115 1124 1140 1163 1170
1174 1177 1190 1196 1219-
1220 1226-1227 1229 1236 1241
1246 1258 1269 1271 1274 1295
1301 1320 1322 1330 1334-1335
1339 1349 1351 1353 1359-1360
1364 1369 1374 1386 1397 1413
1417 1434 1436-1437 1439 1468
1474 1477 1480 1485-1487 1498
1512 1522 1525 1544-1549 1553
1560 1567 1591 1600 1631 1636
1651 1654-1655 1658 1662 1670
1674 1678-1679 1684 1686 1700
1727 1733 1738 1740-1741 1760-
1761 1774 1779 1781-1782

5-8 10 26 30-31 47 50-51 57 68-
69 82 84-85 97 102 113 119 137 
139 150 152 154 156 163 169 174 
176-177 192 194 196-197 212-215 
227-228 247 255 258 261 282 285 
288-299 301 307 311 316 330 334 
349 370-372 392 398 410 415 426- 
427 430-431 433 437 446 454 461 
469 473 477 481-482 493 499 502- 
503 513 522 526 547 552-553 563-
564 572-573 575-576 581-582 585
599-602 605 612 615-617 620 631 
637 647 649-650 656 660 665 670 
674-675 712 719-721 723 728 731 
738 744 746 773 780 784 788-789
802 804 809 811 814 826 831 837
843 845 848 859 866 869 877 905 
913 916 919 921 926 929 937 950
960 963 971 975 977 981 990 992- 
993 1007 1016 1029-1030 1034- 
1035 1038-1039 1045 1059-1060 
1064 1070 1072-1073 1087 1089 
1097 1099-1102 1104 1108 1113 
1141 1149 1161-1162 1175 1208- 
1209 1222 1227 1229 1231 1235 
1238-1239 1243 1253 1285 1287- 
1289 1291-1293 1307 1311 1317- 
1320 1330 1332 1338 1345 1369
1373-1374 1379 1389 1399-1400 
1409 1423-1424 1430 1435-1437 
1443 1459 1484 1486 1490 1493 
1496-1497 1501 1505 1509-1513 
1527 1530-1531 1533 1537 1546 
1549 1553 1565 1567 1569 1571
1577 1586 1591 1599 1602 1625 
1628 1630-1632 1636 1639 1642 
1649 1661-1662 1666-1667 1670 
1675*1684 1690 1699 1705 1712 
1717 1724 1730 1737-1738 1752 
1767 1779 



686 1352 1412 



1411-1412 






Tissue Origin 



Genomic DNA 
from BAC 3931^



adult bladder 



RNA Source 



Research 
Genetics
(CITB BAC
Library)



Invitrogen 



Hyseq 
Library Name
BAC003 



BLD001



bone marrow 



Clontech 



SEQ ID NOS: 



1352 



5-8 17-18 22-23 33 37-39 56-57
80 93 100 120-121 169 201 237 
251-252 272 278 311 348 353 382 
413 415 424 430 443 483 502 542- 
543 562 564 607 616-617 626 635 
652 667 671 710 727 755-756 762
773 786 788 837 840 866 893 898 
909 918 929 966 977 983 1016 
1025 1055 1073 1082 1140 1167 
lias 1189 1199 1270 1369 1481 
1536 1560 1573 1596 1614 1636- 
1637 1649-1650 1654-1655 1658
1669 1671 1690 1719 1727 1731-
1732 1739 1741 1760-1761 1779



3-8 11 13 18 29-31 33 35-36 40 

43-45 47-48 50-51 57 60 65-66 75 
80 82 85 88-89 94 100 103 107 
110 115 118-119 124-125 133-134
136-137 139-141 146 150 152-153 
155 161 163 168-170 172 178-180
187 192-193 197-198 203-205 210- 
213 215 217 219 222 224-226 233
235-237 242-244 255 258 260 263-
264 266 273 276 278 283 286 290 
295 301-302 307 312-313 321 330
333 339 343 352 357-358 370-371
382 384-385 387 389 394 408 410 
412 416 421 424-427 429-431 436- 
437 439 441-442 445 447 454-456 
461-462 471-472 475 477-479 481- 
482 485 488 493 498 500 503-506 
513 516 519 523-524 526 530 535-
540 542 544-545 549 555 565 567
569-577 581 583-586 588 593 601
603-604 608-609 613-619 621-622 
632-633 636-637 642 649-650 656-
660 666 670 672 674-675 679 683 
701 708 716 718-720 731 735-736 
740-742 744-745 752 761 765 772-
773 775-778 780 785-786 789-791
796 798 802 810-812 823-824 826 
830 832-833 837-838 843-844 848- 
855 858-859 866-867 869 878-880 
883 890-892 896 903 905 908 912-
914 922-924 927 930-931 937 939- 
941 952-953 955-958 963 969 973
976 981 985 987 990 992 995 1000
1002 1005-1007 1013 1016 1025 
1028-1031 1033 1035 1037 1039 
1042 1044 1047 1050 1053-1054 
1059 1061 1063 1066 1070-1071 
1079 1106 1110-1113 1115-1117
1124 1126 1134-1135 1142 1144- 
1145 1163 1172 1178 1197 1199- 
1200 1202 1216-1217 1224 1227- 
1228 1240 1246 1254 1261 1266 
1270 1278 1281 1285 1287 1290-
1291 1293 1299-1301 1308 1314 
1317-1320 1327 1331 1339 1343 
1346 1349 1353 1356 1361 1367 
1369 1372-1374 1379-1380 1394 
1400 1403 1406 1408 1413 1417 
1419 1423 1425-1427 1430-1431 
1433 1439 1443 1446-1449 1459
1463-1464 1482 1486 1493-1494 



Tissue Origin 



RNA Source 



Clontech 



Hyseq 
Library Name 



SEQ ID NOS:



1506 1509 1513 1521-1522 1524
1526 1528 1531 1536-1537 1543
1546 1548-1549 1552 1554-1555
1557-1559 1571-1572 1581 1589-
1592 1597-1600 1609 1614 1621
1626-1628 1630-1632 1634 1636
1638-1639 1641 1646-1647 1651
1653-1655 1661-1662 1676-1681
1684 1686 1690 1702 1707 1711
1713-1714 1717 1720 1722-1723
1727 1737-1738 1740 1756 1767
1772 1781-1782 1785-1786



bone marrow



bone marrow



bone marrow 



adult colon 



11 15-16 19 30-31 35-36 68-69 75 
83-84 93 99 103 108-109 118 137
139 169-170 174 177 180 190 193 
212-213 219 222 225-226 232 237 
255 259 264 273-274 284 286 290- 
292 295 301 303-304 307 312-313 
316 324 326 330 334-335 348 352- 
353 357 360 370-373 384 386-387 
397 403-404 414-416 421 425-427 
429-430 433-436 440 444 451 454 
465-466 472 475 478 491 493 516 
520 523 525 531 545 548 552 566
569-570 581 583 590-591 597-598 
601 616-617 621 641 650 652 656 
659 671 674-675 679 684 710 718- 
719 728 734 737-738 742 761 765 
774-778 790 811 814 818 830 834- 
836 854-855 859 866 869 871 878-
879 884 889 892 904 922-923 932 
990 992 998 1001 1004 1016 1036 
1042 1048 1051 1054-1055 1058 
1088-1089 1106 1112-1114 1155
1157 1192 1200 1223 1227-1228 
1236-1237 1260-1261 1282-1283
1285 1287 1295 1314 1317-1321 
1324-1327 1330 1333 1341 1343 
1347 1350 1353 1355-1357 1367
1369-1370 1373 1377 1379 1381 
1383-1384 1394 1397 1400 1406 
1413 1417 1425-1427 1438 1442 
1446 1459-1460 1470 1493 1505 
1521 1536 1546-1549 1560 1573-
1574 1578 1598-1600 1621 1626
1631 1634 1646 1649 1653 1656 
1658 1669-1670 1683-1684 1687- 
1688 1690-1693 1696 1699 1702 
1704 1707-1709 1711 1720 1722- 
1723 1725 1727 1729 1731-1733 
1738-1740 1743-1746 1752 1755 
1760-1761 1767 1777 1781-1782 
1786 



Clontech 
Clontech 



BMD004 



73-74 503 922 1036 1711 



BMD007 



95-96 866 1320 1475 



Invitrogen 



17 56-58 103 110 117 144 150 171 
179 185 188-189 201 204-206 210 
218-221 225-226 231 237 251 277 
288 310 312 320 333 359 386 388
394 408 420 455 481 485 503 510- 
512 590-591 615 635 647-648 665 
672 684 697 710 725-726 743 780 
786 788 826-827 848-850 854-855 
858 866 872 898 918 921-923 953 
976 983 993 1005-1006 1017 1020 
1025 1027 1054-1055 1063 1068-
1069 1140 1153 1170 1185 1196 
1199 1220 1280 1314-1315 1320 
1345 1351 1355 1369 1428 1439 
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Tissue Origin 



Mixture of 16 
tissues 
mRNAs 



Mixture of 16 
tissues
mRNAs



adult cervix 



RNA Source 



Various 
Vendors 



Hyseq 
Library Name 



Various 
Vendors 



BioChain



CTL016 



"CTLOai 



SEQ ID NOS: 



"1462-1464 1512 1556 1S83 ISST" 
1594 lS9e 1614 1625-1626 163X 
1639 1645 1650 1675-1677 1687-
1688 1701 1713-1714 1724 1740 
1765 



401 1490 1685



312 78;=! X132-1133 1403 1712 1715 



326 
362 



^^^^^ ri 4-8 11 13 18^2l" 25-26 30^ 33-33 
37-39 43 46-47 58 61 64-66 71 
73-74 82 85 54 100 103^104 113 
118 122 126 130 134 140 147 J.53- 
156 163 170 179 181 186 192 195- 
-"^ 198 201-202 218-219 222 229 
231 257 266 27€-277 285-^286 288 
298 301-302 304 307 312-314 324 
339-330 332 335 342 352 358 
371-372 376 379 381-382 384 
388 398 400 410 414 41ff 419-420 
426-427 430-431 433-436 439 446 
448 461-462 464 471-477 479 482 
483 491 493 496 503 506 510-513 
516-517 526 530 S35 542-544 546- 
547 557 sex 572-573 575*577 581 
582 S85-SS6 586-589 593-594 600 
602 604-605 607-609 612 615-619 
S23 644 650 ^54 657-658 662-665 
670 672 680 683 691-694 698 706 
708-709 711 713 720-721 727 729 
731-732 737 745-747 753-754 760 
765 771 774-777 780 790 793 796 
798 800 803 SOS 818 826 828 831- 
832 834-836 843 847-848 8SI-8SS 
857-860 864-866 869 871 876 878- 
880 882 887 890-891 897 899-902 
905-908 912-913 916 918-919 922 
927 932 934-938 944 948 955-956 
958 963-964 967 969^970 912 976 
978-979 983 985 990 992 1000 
1005-1007 1016-1017 1024 1027 
1033 1036 1038 1045 1047 1053- 
1056 1066-1067 1071 1073 1075 
1079 1082 1098 1113 1124 1129 
1134 1139 1146-1149 1163 1167 
1170 1173 1175 1177 1181 1197 
1200 1202 1211 1214 1216 1221- 
1222 1225 1227 1232-1234 1240- 
1241 1243 1258 1264-1265 1268 
1270 1279 1287-1290 1308 1310- 
1311 1316 1320 1323 1327 1345 
1349 1353-13S4 1360 1372-1374 
1383-1384 1386 1394 1397 1405-



[...] adult brain [...] mRNA (Invitrogen), 3) normal adult liver
mRNA (Invitrogen), 4) normal fetal brain mRNA (Invitrogen), 5) normal fetal kidney
mRNA (Invitrogen), 6) normal fetal liver mRNA (Invitrogen), 7) normal fetal skin mRNA
[...] (Clontech), 15) human esophagus mRNA
(BioChain), 16) human conceptional umbilical cord mRNA (BioChain)






WO 01/53312

PCT/US00/34263



Tissue Origin 


RNA Source 


Hyseq
Library Name



SEQ ID NOS:




















1406 1416 1425-1427 1431 1436-
1437 1442 1446 1448 1453 1459
1466 1472 1478 1482 1496 1501-
1503 1506 1512 1522 1527-1528
1531 1533 1541 1547 1569 1571
1585 1589 1597-1598 1600 1608-
1609 1614-1616 1620 1623-1624
1626-1628 1630 1638 1641 1643
1649 1653 1656 1662 1667 1669
1674-1675 1683 1685-1688 1699
1702 1709-1710 1715 1717 1722
1724 1729 1731-1732 1735-1739
1741 1743-1744 1748-1749 1755
1760-1762 1767 1773 1778 1785-
1786










diaphragm


BioChain 


DIA002 


137 


282 


289 730 


780 


986 


1409 








1478 1559 1614 








endothelial


Stratagene


EDT001


3 5 


-10 


13 15-21 


24- 


26 29 34 37- 


cells 






39 42 44-4S 50- 


51 53-55 


57-58 








60-61 65-66 68- 


69 73-74 


77-78 80 








82- 


33 85 87 89 


S3-.9 


S 101-105 108 








1.10 


112 


-114 116 


iia 


-122 


124 128 








X3 3 ■ 


-134 


137-142 


147 


-iSO 


152-1S3 . 








X61 


-163 


166-172 


176 


-179 


187 190 








1S2 


194 


196-201 


204 


-207 


210 212- 








214 


220 


224 229 


-230 


233 


235-236 








24 Q - 


-241 


251-252 


258 


261- 


-262 265 








267- 


-269 


272 276 


-277 


279- 


-281 284- 








285 


288 


290 295 


-296 


301* 


•302 310- 








311 


313 


316 321 


325 




331-333 








33S 


340 


-342 351 


-355 


360 


o t X ^ /3 








380- 


362 


364 387 


390 


392 


397 400 








407- 


-406 


410 412 


414 


416 


425-427 








431 


434 


-436 439 


444 


-445 


449 454 








463- 


464 


472-475 


477 


-479 


486 488- 








490 


497 


-498 500 


-504 


510- 


-513 516- 








519 


522 


524 526 


-528 


532- 


534 536- 








540 


542 


548 


S61 


-563 


566-567 








572- 


576 


579 581 


S8S- 


"586 


589 593 








S9S 


597 


599 603 


607 


-612 


615-617 








620 


622 


626 630 


632- 


•634 


638-641 








644 


647 


656-660 


662- 


-664 


670 673 








678 


680- 


•682 692 


-697 


707 


709-710 








712- 


713 


719 730 


732 


734 


736 738 








743- 


74 6 


751 759 


768 


771 


773 775- 








778. 


783 


786-789 


793 


800 


8C3 805- 








807 


810- 


•811 614 


816- 


818 


821-822 








824 


826 


828-829 


832 


834- 


838 842- 








845 


848- 


-850 854- 


-860 


862 


864 069 








871 


874 


876-879 


883 


665 


887 890- 








891 


894- 


895 898- 


■900 


903 


908 910- 








9X3 


916 


919-922 


924 


926- 


928 930- 








93S 


939 


943 948- 


•949 


951- 


954 957 








959- 


961 


964 969- 


-970 


973 


975-978 








983- 


904 


988-990 


992- 


993 










1000 


1002 1004-1013 


1016 


-1020 








1022 


-1025 1028 1031 


1033 


-1034 








103B 


-1046 1050 lOSS- 


1056 


1059- 








1060 


1062-1064 1067- 


1070 


1072- 








1074 


1076 1078 1082 


1Q8€ 


-1087 








1089 


-1090 1093-1097 


1099 


-1103 








1107 


1109-1113 1116- 


1117 


1124- 








1126 


1128-1131 1134- 


1135 


1138 








1140 


ai44-1145 1148- 


1149 


1153 








1157 


1160 1163 1 


171 


1183 


-1184 








1198 


-1199 1202 120S- 


1207 


1211 








1216 


"1217 1219 1221 


122S 


1229 








1232 


-123 


5 1238-1241 


1243 


-1244 i 








1246 


1250 1253 1257- 


1258 


1261 
Tissue Origin


RNA Source


Hyseq
Library Name




SEQ ID NOS:












1266 


1268 


12 70 


-1271 


12 


74- 








1277 1280 


-1283 


12 85 


-1286 


1288" 








1290 1293 


1295 


12 98 


1308 


13 


12 








1317-132G 


1324- 


13 2 5 


132 7 


13 


29- 








1330 1334 


-1335 


13 3 8 


1342 


-1343 








1345-1347 


13S0 


13 55 


-1356 


13 


59 








1367 13^9 


1374 


1376 


1379 


13 


98 








1400 1406 


1408 


14 14 


1417 


14 


19 








1424-1426 


1428- 


1431 


1434 


-14 


38 








1440-1442 


1448 


1450 


1462 


-14 


66 








1468 1472 


1474 


1478 


1487 


-14 


80 








1491-1493 


1501- 


1504 


1506 


1509 








1511 1516 


1520- 


1521 


1526 


1529 








1531 1536 


-1537 


1539 


-1540 


1546- 








1547 1549 


1552 


1555 


1557 


-1559 








1561-1565 


1568 


1571 


1575 


1578- 








1579 1581 


-1583 


1587 


-1588 


1590 








1592 1597 


1605- 


1606 


1611 


1613 








1615 1618 


-1621 


1624 


-1628 


1630- 








1631 1634 


1636 


1638 


1641 


1643- 








1650 1652 


-1659 


1664 


1666 


-1667 








1669 1671 


1675- 


1681 


1683 


-1688 








1696-1698 


1703 


1711 


1715 


-17 


16 








1719 1722 


"1723 


1726 


1731 


-1733 








1736 1739 


-1741 


1743 


-1744 


1749 








175S 1760 


-1761 


1765 


1767 


-17G8 








1771-1773 


1776 


1779 


1783 


-1786 


Genomic clones 


Genomic DNA 




286 686 1297 13 


03-1304 1 


352 




from the short


from 




1411-1412 


1754 










arm of


















chromosome 8 


Research 
















esophagus


BioChain


ESO002


131-132 261 289 


380 


503 


BGO 


092 








1000 1007 


1397 










fetal brain


Clontech


FBR001


62-63 89 112 12S 194 322 


336-338 








379 391 411 481 


546 


563 


607 


679 








VIO 867 1012 1031 1055 1251 


1262 








1320 1407 


1643 


1652 


1686 


1731- 








1732 1746 


1765 










fetal brain 


Clontech 


FBR004 


68-69 90-91 139 


212- 


■213 


301 


331 








362 374 403 436 


611 


645- 


S46 


€59 








668 670 631 785 


805 


845 


X163 








1209 1216 


1232- 


1233 


1238 


-1239 








1387 1410 


1416 


1430 


1496 


1536 








1S47 1S93 












fetal brain 


Clontech 




5-9 25 43 


60 62 


-63 65-66 


70 


72 








QO 87 92 101 103 108 114 


136 139 








149 152-153 157 


166 


171-172 


175 








207-208 210 212 


-213 


221-226 


237- 








238 251~2S3 266 


272 


279-281 


295 








301-302 307 310 


317- 


318 321- 


-324 








330 333-334 336 


-338 


346-347 


352 








357 370 373 377 


379- 


380 382 


384 








391-392 397 399 


402 


406-408 


410- 








411 417 421 424 


426- 


427 430 


436- 








437 440-443 454 


460 


464 467 


473 








476 483 488-489 


49S 


497 508 


510- 








513 516 519-520 


524 


530 537- 


-540 








544 547 550 561 


567 


572-574 


582 








590-591 59S 597 


604 


607-609 


615 








623 628-629 631 


634 


638-640 


655 








657-658 eCQ 665 


669 


674-675 


679 








689 691-694 696 


-697 


699 701 


706 








710 716 720 728 


732 


734 736 


742- 








744 757-760 763 


775- 


778 780 


799 








806-807 810 817 


-818 


826 839 


843 








858 861 664 871 


-872 


884 890- 


891 








094-895 89£J 904 


915 


921-923 


935- 








936 938 945 950 


9S2 


9S5-9S6 


958- . 








959 961 953 967 


969- 


971 990 


992 ^ 
Tissue Origin



RNA Source 



Hyseq 
Library Name 



SEQ ID NOS: 



fetal brain 
fetal brain 



999 1001 
1016 1022 
103S 1042 
106S 1067 
1114-1115 
1151 1153 
1172-1} 73 
1190-1200 
1226-1227 
1253-1255 
1270-1273 
1314 1317 
1339 1341 
1371 1373 
1386 1392 
1425-1426 
1440-1441 
1502-1503 
1519 1536 
1559 1S73 
lSll-1614 
1S40 1051 
1693 1696 
17X8 1720 
1730-1733 
1742 1745 
1767 1771 
1786 



lOOS- 
1024 
1047 
1070 
1119 
1156 
1178 
1211 
1229 
12SS 
1281 
1320 
1344 
1376 
1396 
1426 
1448 
1507 
1544 
1589 
1619 
16S7 
1703 
1722 
1735 
1755 
1772 



1006 
1029 
1048 
1082 
1131 
1150 
1184 
1216 
1231 
1260 
1287 
1326 
1350 
1379 
-1398 
-1429 
1466 
1511 
1549 
-1B90 
1621 
-1658 
1704 
1724 
-1736 
1759 
1777 



1008 1013 
1030 1032 
1052 1056 
1089 1109 
1143-1149 
1163 1167 
1186 1188 
1222-1223 
1236 1245 
1262 1266 
1308-1309 
1334-1335 
1356 1369- 
1381-1382 
1419 1423 
1432 1437 
1470 1482 
1513 1516 
1550 1557- 
1598 1608 
1625-1626 
1676-1679 
1713-1714 
1726 1728 
1738-1739 
1761 1765 
1779-1780 



Clontech 



FBRS03 



235-236 520 864 1068 1188 1587 



Invitrogen 



15-18 20-21 24-25 29 34 43 61-63 
77-78 98 101 103 107-108 128 130 
136 146 148 165-166 171 174 181
IBS 196-198 204-205 208 223 230 
235-236 251 253 2SX 268-269 280- 
281 284-285 288 309-311 321 329 
334 339 346-347 350 357-359 381- 
383 390 407 418-419 430 434-435 
438 443-444 461 464-466 483 490 
494 509 516 519 522 527 557 561- 
562 572-573 590-591 595 597 623 
632 647-648 650 655 665-670 672 
682 690-691 700-701 710 717 736 
746 782 784 .788-789 814-815 825 
829 840-841 847 854-855 857-858 
897-900 904 919 925 935-937 946 
948-949 954 960-962 966 969-970 
986 996 1000-1001 1005-1007 1012 
1014 1022-1028 1045 1052 1055 
1068 1070 1072 1078 1082 108S 
1090 1109 lllS 1118 1120*1128 
1136-1137 1144-1145 1149 1156- 
1157 1193-1195 1198 1204-1205 
1220 1222 1234 1257 1262 1271
1274-1275 1280 1285-1286 1294 
1312 1314 1317-1320 1330 1342 
1344-1345 1349-1350 1355-1356 
1358 1364 1369 1379 1383-1384 
1431 1435 1476 1507 1519 1532
1536 1547 1554 1564 1567 1578 
1582 1587 1593 159S 1601 1608 
1615 1619-1621 1638 1644 1661 
1665-1666 1673 1687-1688 1690 
1715 1723 1728 1749 1753 1757
1759-1761 1765 1771 1774 1776 
1778 1781-1782 1786 



fetal heart 
fetal kidney 



Invitrogen



FHR001



105 124 160 289 864 1036 1148 
1229 1614 1615 1762 1785 



Clontech 



5-8 11 40 47 57 65-66 82 85 102 
124 163 171 216 222 224 235-236 
Tissue Origin
RNA Source



fetal kidney 



fetal kidney 



fetal lung



fetal lung



Clontech



Invitrogen



Clontech



Invitrogen 



fetal lung



fetal liver-
spleen



Clontech



Columbia
University 



Hyseq 
Library Name 



FKD002 



FKD007 



FLG001



FLG004



FLS001



SEQ ID NOS: 



258 277 280-281 307 310 314 330 
371 387 392 39S 403 422-423 431 
436 443 455 469 500 519 522 542 
563 572-573 585 600 619 623 650 
654 657-658 650 679 719 731 760 
798 821 833 844 854-855 857 864 
868 878 911 929 958 960 969 990 
992 1007 1046 1087 1103 1129 
1139 1285 1312 1331 1355 1369 
1371 1376 1391 1422 1425-1426 
1440-1441 1470 1543 1598 1601 
1618 1631 1651 1654-1655 1669
1678-1679 1691-1692 1733 1785



352 384 426-427 440 S83 602 lO^O" 
1131 1324-1325 1636 



20-21 82 163 335 679 988-989 
1000 1227 1230 1320 1554 



35-36 94 323 371 398 426-427 445"" 
473 549 560 604 616-617 626 631 
649 651 719 746 786-787 832 842 
849-850 864 894-895 1075 1178 
1182 1200 1206 1309 1311 1345 
1429 1493 1567 1S76 lff20 1686 



9 15-16 29 41 47 68-69 83 88-89 
102 124 137 152-153 165 196 224 
229 231 24^ 254 256 267 291-292 
300 325 333 344-345 352 373 376 
379 384 408 425-427 430 432 467- 
46a 475 403 488 493 Sl6 531 S35 
545 547 549 564 562 602 623 644 
660 6€2-eG^ 670 673 725-726 728 
761 766-767 774 80S 830 852-853 
864 875 921 932 937 946 949 963 
988-989 1014 1016-1017 1024 1027 
1090 1097 1170 lies 1200 1215- 
1216 1224 1258 1290 1309 1320 
1342 1347 1355 1369 1381 1413- 
1414 1431 1438 1449 1491 1512 
1536 1547 1557-1560 1567 1590 
1601 1636 1644 1653-1655 1662 
1667 1671 1675 1680-1681 1706 
1739 1760-1761 1769 



103 276 
1614 16 



334 465-466 
58 



737 843 1131 



3-11 13 
51 54 S 
77-80 8 
110 112 
135-139 
157 163 
180 186 
200 202 
233-236 
255-256 
274 276 
293 295 
311 314 
332 342 
58 360 

86- 387 
06 408 
37 439- 
56 459 

87- 488 
506 509- 
529 531 

53-554 
6 579 



15-21 25 30 
6-58 60-66 6 
2-83 85 87 8 
116-124 126 
141 144 147 
-165 167-172 
188-190 193 
-206 210-214 
240-244 246 
258 261-265 
-278 280-281 
299-301 304 
316 316 320 
344-345 350 
362 370-374 
390 392-393 
410-412 415 
-442 444-445 
461-470 472- 
490-491 493 
-513 515-520 
534 536-540 
561-562 564 
581 583 585- 



-39 41-4 
8-69 72 
9 92-103 
-127 130 
149 152 
174 176 
194 196 
219 221 
-247 250 
268-269 
284-266 
306-307 
-321 326 
352-353 
376 378 
400-401 
417 419 
448 452- 
-479 481- 
500-501 
522-524 
542 547- 
567-558 
-597 599- 



8 50- 
75 
105- 
133 
-153 
-178 

198- 
-231 
-251 
272 
288 
309 
329- 
356- 
-384 
403 
422- 
•454 
483 
503- 
526- 
549 
571- 
605 
Tissue Origin



RNA Source 



Hyseq 
Library Name 



SEQ ID NOS: 



607 610-613 615-621 623-624 626 
628-634 636-640 644 647-650 655-
660 665 665-670 672 674-675 678 
681-682 684 690-695 697 702 708- 
710 713-714 716-713 725-728 730- 
731 734 736 738 740-741 743-746 
748 750-751 759-766 768 772 774-
777 779 783-788 793 796 798 800- 
80S 808 010-812 014 81Q-819 821- 
824 826-832 834-837 843-847 849- 
867 869-876 878-883 887 689*895 
897-896 902 904-914 916 919 921- 
928 930-937 939 945-950 SS3-958 
960-961 963-965 967 969 971 974- 
978 980-983 936 988-990 992-993 
995-997 1000-1002 1004-1008 1012 
1014 1016-1019 1025-1026 1028- 
1031 1033 1035-1036 1039-1044 
1047 1049-1050 1053-1056 10S8- 
1059 1061-1064 1067-1070 1072- 
1074 1076 1078 1062 1085-1087 
1089-1090 10S7 1099-1103 1107- 
1113 1115-1119 1121-1123 112S 
1127-1128 1131-1134 1136-1137 
1144-1150 1153 1159-1160 1163 
1170 1175 1177-1178 1188 1190- 
1192 1195-1200 1202 1206 1208- 
1211 1214 1216 1218 1221-1222 
1225 1227 1234 1237 1241 1244 
1246-1247 1251 1254 1258 1261 
1266 1268 1270-1273 1277-1282 
1284-1285 1287-1290 1294 129^- 
1300 1306-1308 1313-1320 1324- 
1325 1327 1330 1332-1333 1338 
1341 1343 1345-1347 1349-1350 
1353-1360 1362-1363 1365-1367 
1369-1370 1372-1374 1376 1378- 
1381 1383-1384 1386 1389-1391 
1400 1402-1403 1405-1410 1413 
1415 1417-1419 1422-1429 1431 
1435-1437 1439-1442 1445-1446 
1448-1449 1454 1458-1459 1466- 
1470 1472 1474 1477-1478 1480 
1482 1485 1491-1493 1496-1498 
1501-1507 1509 1511-1512 1516- 
1519 1524-1526 1529 1532 1536-
1541 1546-1547 1549-1550 1552-
1554 1562 1564 1569 1572 1574-
1575 1578 1581 1583 1587-1588 
1591-1592 1594-1595 1597-1598 
1600-1604 1611-1612 1614-161S 
1617-161B 1620-1622 1624-1625 
1627-1628 1630-1632 1634-1639 
1645-1651 1653-1662 1664 1667- 
1669 1671 1673-1674 1676-1688 
1690 1696 1701-1703 1706-1709 
1711 1713-1714 1718-1719 1722 
1724-1727 1731-1733 1738 1740- 
1741 1743-1744 1746 1748 1751- 
1752 1754 1760-1765 1767-1773 
1780 1783-1786



fetal liver- 
spleen 



Columbia 
University 



FLS002 



3-11 13 15-21 26 29 32 35-39 42 
44-45 48 50-51 54-55 57-58 61 S4 
68-69 73-75 78 80 82 84 87 95-98 
100 103 105 107-108 110 112-113 
116-119 122-125 128 130 137-133 
145 147-153 155 157 159 161-163 
166 168 171-172 174-175 177 181 ^ 
188-189 193-194 196-198 200-203 
Tissue Origin



RNA Source 



Hyseq 
Library Name 



SEQ ID NOS: 



206 212-21S 219-221 223 22S-229 
231-232 240-244 246-247 250-251 
258-259 262 264 268-269 272 27S 
277 280-281 284 286 288 290-292 
29S 298-299 301-304 306 308-310 
318 320-321 323 325 329 331 334 
342 348-349 3S2-3S3 356 359 368 
371 374 376-379 381-384 386-387 * 
392-393 397-398 400-401 403 410- 
413 421 423 426-427 429-430 433- 
436 438 440 443 445 448 451-452 » 
454-455 460-463 46S-467 469 471-"" 
473 475-476 478-479 481-483 487 ' 
490-491 493-494 497 500-501 SOS- 
SOS 509-513 S15-S17 519-520 S24 
526-531 S34 S37-S42 544 547 552- 
554 556 5S8 561-562 564-567 571- 
577 583-587 590-591 593 595 597 
601 604-606 608-613 616-617 619- 
624 626-632 634 637-642 644 647 
649-652 6S4-6S9 662-665 669-672 
674-675 681-682 685 688 690 696 
698 700-703 707 709-710 713 717 
719-721 723-724 728 731-732 734 
737-738 742-745 748 7S2 754 759 
763-766 768 770 773-777 780 782 
784 786 791 795-798 801-802 805 
808 811-812 818 823-824 826-827 
832 934-837 839 843 846 84d-8S6 
358-861 865 867 869 871 873-874 
876 878 881-882 887 889 892 894- 
898 901-902 904 906-908 913-915 
919 921-924 926-932 934-935 937 
939-941 943 946-947 950 953 958 
961 965-967 971 973-975 977-979 
981 984-98S 990 992-993 995-997 
999 1001 1004-1007 1009-1011 
1013 1016 1020 1023 1025 1027- 
1031 1033-1035 1039-1042 1044- 
1045 i:.9 1053 10S5-1056 1058- 
1059 1062 1064-1065 1067-1070 
1072-1074 1079 1082 1087 1089 
1093 1097 1099-1103 1105-1107 
1109-1114 1123 1125-1127 1132- 
1134 1140 1143-1145 1148-1150 
1156 use 1160 1163 1172-1173 
1177-1178 1181-1184 1190-1192 
1195-1197 1199 1204 1206 1208 
1211 1214 1216 1219 1227 1230 
1234-1235 1237 1240-1241 1243 
1245 1247 1256 1258 1260-1261 
1264 1268 1270-1271 1275 1278- 
1279 1284-1286 1286-1289 1299- 
1301 1306 1308 1312 1314 1317- 
1319 1323-1325 1327-1330 1334- 
1335 1339 1343-1347 1349-1350 
1354-1355 1357 1360 1362-1363 
1365-1367 1369 1372 1376 1378- 
1380 1386 1389-1391 1394 1400 
1403 1406 1409 1416-1419 1422- 
1427 1429 1435 1437-1438 1440- 
1442 1446 1448-1450 1453 1460- 
1461 1468 1470 1472 1474-1475 
1478 1482 1486 1490-1493 1496 
1498 15C0-1S04 1506 1508-1509 
1511-1512 1516 1518-1519 1521 
1524-1528 1531 1536-1538 1543 
1547 1550 1554 1SS6 1564 1567- 
1569 1580 1S87-1S88 1591-1592 ^ 
Tissue Origin



RNA Source



Hyseq
Library Name



SEQ ID NOS:



1597-1598 1600-1601 1611-1612 1618-1628
1630-1631 1635-1638 1641 1646-1649 1652
1654-1659 1661-1662 1664 1667-1669 1674
1676-167$ 1683-1684 1686-1688 1691-1692
1699 1702 1707 1711 1713-1714 1717 1719
1722 1726-1727 1730-1733 1738 1740
1743-1744 1748-1752 1758 1760-1761
1763-1764 1767 1769 1772-1773 1776 1779
1783-1786







fetal liver- 
spleen 



Columbia 
University 



FLS003



103 300 318 331 352 372 379 381 
384 392-393 403 422 424 429 434- 
435 440 444 4S3 503 515 544 592 
978 1064 1324-1325 1327 1333 
1357 1369 1378 1418 1424 1622 
1646 1649 1680-1681 1689-1690 

1717 1743-1744 1769

15-16 26 34 58 61 64 70 75 78 89 
98 105 112 116 120-121 123 133 
151 166 176 180 194-196 198 200 
204-206 210-211 220 225-226 230 
23S-23G 239 247 2B9 261 267 272 
277 280-281 303 310 313 317 320- 
321 329 344 356 371 374 376 379- 
382 39S 408 412 414 419 429 434- 
43S 441-442 465-466 490 494 504- 
506 509 S22 S27 534 552-553 562 
567 569-570 572-574 607 631 657- 
658 667 669 672 685-686 702 717 
725-726 732 748 7S9 761 778 784 
786 a09 817 829 037 857 861 872- 
873 875 881 889 694-S9S 909 911 
916 954 963 967 974 977 986 988- 
989 993 995 997 1000 1005-1006 
1008 1014-1015 102O 1042-1043 
1070 1086-1087 1089-1090 1118- 
1119 1122 1144-1145 1148 1153 
1157 1159 1183 1195-1196 1227 
1350 1257-1258 1262 1267 1280 
1285 1307 1312 1314 1317-1320 
1344-1345 1349-1350 1355 1362- 
1363 1403 1405 1415 1419 1425- 
1476 1429 1431 1442 1448 1463- 
1464 1469-1470 1489 1528 1S3S 
1S39 1549-1550 1S57-1S62 1577 
1583 1598 1601 1611 1615 1622 
1644 1649 1666 1674 1706 1721 
1738 1746 1763-1765 1774 1776 
1779 



fetal liver 



Invitrogen 



fetal liver 



Clontech



676 998 1719 



fetal liver 



Clontech 



FLV004



fetal muscle 



Invitrogen 



FMS001



93 133 214 301 3SS 3 
581 601 679 837 847 
1236 1270 1313 1324- 
1355 1367 1425-1426 

1733 1760-1761 

26 37-39 50-51 58 84 
113 128 131-132 139 
194 198 201 206 211 
261 276 282 286 302 
376 379 383 398 412- 
436 448 452 4^2-463 
519 529 561 569-570 
607 623 626 635 647 
725-726 730 733 761 
826 837 860 874 913 
970 980 986 988-990 
1001 1007 1014 1027 
1045 1O60 1064 1070 



74 379 555 
859 1123 
1325 1327 
1536 1690 



86 89 98 
155 172 186 
230-231 256 
325 359 361 
413 419 430 
473 477 503 
590-591 597 
660 672 715 
775-777 788 
915 921 935 
992 1000- 
1035-1036 
1083 1097 
Hyseq
Library Name 



SEQ ID NOS: 



1099-1102 1116-1117 1121 1164 1173 1198
1208 1228 1240 1256 1266 1270 1277 1298
1317-1320 1324-1325 1329 1336-1337 1369
1383-1384 1399-1400 1403 1409 1433 1505
1514 1542 1551 1554 1557-1559 1562 1589
1599 1620 1632 1644 1650 1652 1671 1675
1712 1725-1726 1743-1744 1754 17^6













fetal muscle 



Invitrogen 



FMS002 



119 221 273 402 426-427 463 547 
599 736 869 1000 1033 1083 1266 
1431 1440-1441 1468 1545 1599 
1673 1678-1679 1687-1688 1710 
1712-1714 1723 1725 1731-1733 
1743-1744 1760-1761 1767 



Invitrogen 



FSK001



1 4-11 15-16 20-23 25 29 33 40 
43 46 56-57 60-61 64-66 75 82 87 
97-98 105 107-108 113 118-119 
123 133 125-137 139 144 146 148 
151-153 156 163 170 176 180 188- 
189 197-128 200 202-203 210 218 
222 231 246-247 261 263 265-270 
277 285-286 290 293 299 301 307 
311 321 325 328 330 333-335 339 
341 345 351-352 355-356 358-359 
362 368 370 372 376 379-382 384
388 394 404-405 408-409 411-412 
419-420 424 426-427 436 441-442 
445 448-449 454 462 465-466 472 
476 490 493 504 506 509 SlS-517 
519 526 531 537-540 547 S49 560- 
561 567 572-573 581 584 589 611- 
612 615 623 630-631 635 647 649 
651 6S7-658 660 6$2-6€S 667 669 
672 676 678 681 688 701 704-705 
709-710 713 717 720-721 725-726 
728-729 732 748 750 753 759 764
766 770 775-777 780-781 786 788- 
789 798 809 811 814 816-817 822 
824-826 831 842 857 859 861 863- 
864 881 894-895 908 910-911 916 
918 922-923 928 932-933 935 937 
946 948-949 953 960-961 966-967
970 975 977 986 990 992-993 999- 
1000 1004 1007 1013 1018 1025 
1027 1032 1035 1041-1043 1054 
1057-1058 1060 1062-1064 1069 
1072 1077 1090-1091 1097 1099- 
1103 1108 1113 1119 1123 1128 
1131 1134 1140 1148-1149 1152- 
1153 1156 1163 1167 1178 1182 
1189 1192 1195-1196 1198 1201- 
1205 1208 1211-1212 1216 1219- 
1220 1222 1225 1240 1243 1258 
1266-1267 1274 1277 1280 1282- 
1285 1299 1310 1317-1322 1324- 
1325 1329-1330 1342 1344 1346 
1349-1351 1354-1357 1365-1366
1369 1371 1373 1376 1378 1380 
1383-1384 1387 1399-1400 140S 
1410 1427 1429 1431 1433-1435 
1439-1441 1448-1449 1454 1457 
1468 1470 1472 1475 1480-1481 
1487 1490-1491 1493 1498 1509 
1512 1521 1525-1526 1529 1535-
1536 1547 1549 1557-1559 1588
1592 1595 1597-1598 1601 1603- 
1604 1608 1611 1614 1618 1624- 






Tissue Origin 



RNA Source



Hyseq 
Library Name 



SEQ ID NOS: 



1626 1632 
1644 1646 
1665 1668 
1702-1703 
1724 1727 
1742 1747 
1765 1772 
1786 



1634 X636 
1654-1657 
1675 168B 
1709-1710 
1731-1732 
1749 1755 
1776-1777 



1641 1643- 

1660-1662 

1687-1689 

1716 1719 

1737-1740 

1760-1761 

1779-1780 



fetal skin 



Invitrogen 



13 286 302 30 
339 341 354 370 
408 414 426-427 
515 544 585 598 
1076 1109 1155 
1333-133S 1343 
137i 1377-1378 
1466 1647 16S6 
1688 1693 1718 
1732 1739 1755 



313 321 33 
372 385 4 
433 436 4 
767 810 8 
1317-1320 
1347 1350 
1391 1397 
1678-1679 
1721 1725 



0 335 
00 402 
50 454 
45 939 
1326 
1369- 
1422 
1687- 
1731- 



fetal spleen 
umbilical cord



BioChain 



BioChain 



FSP001



110 137 211 353 589 927 1108 
1639 1771 



FUC001



114 
154 
184 



392 
420 
454 



4-8 10 12 14 17 33-36 44-46 57 
64 68-69 75 82 85 101 104 113- 

116 119 122-124 133 137 153- 
1S7 161 163 166-167 175 181- 
186 192 197-198 200-202 212- 
215 230 234 246-247 251 2S6 263 
267 271-272 280-281 284 295 301 
314 317 321 326 333-335 345 351 
356 368 371-373 379-380 386 390 
394 406 408-410 412 414 416 
424 427 430-436 438 444-446 
459 461 463 467 473 482-483 
466 488 490 495 504 509 524 526 
537-540 547 5S5 S61 574-577 5B8- 
591 593 606 615 620-621 632 637 
C45-647 650 6S9-660 662-664 667- 
^68 674-675 684 687 696 698 701 
703-705 709 711 714 719-720 725- 
727 732 749-750 762 765 771 775- 
777 780 789-791 793 796 802-803 
814-817 822 833 843 845 848 858 
861 864 875 879 888 894-895 897- 
900 903 905-907 911-912 925 930- 
933 936 940 948 953 960 966 977 
984 990 992 998 1000-1001 1005- 
1O07 1016 1023 1025 1037 1046- 
1047 1059 1061-1063 1073 1076- 
1077 1089 1094-1097 1112-1113 
1X15 1134 1144-1148 1151 1154 
11S6 1163 1171 1197 1204-120S 
1208 1216 1218 1224 1234-1235 
1243-1244 1246 1279 1283 1286- 
1287 1298 1316 1320 1344 1346 
1350 1357 1359 1371 1373 1375 
1381 1398 1400 1403 1408 1414 
1424 1427-1428 1431 1433 1440- 
1442 1446 1454-1455 1479 1482 
X484-14B5 1489 1492-1493 1504- 
1505 1513 1525 1527 1536 1538 
1546 1565 1567 1571 1573 1575- 
1576 1578-1579 1591 1595 1600- 
1601 1608 1612 1615 1621 1624 
1626 1636-1637 1647-1648 l65l 
1653 1656 1658 1661-1662 1672 
1675 1682 1684 1686-1688 1690 
1709-1710 1722 1727 1729 1735- 
1738 1740-1741 1760-1761 1768 



4 9 11-13 17-18 22-23 25 37-39 
42-47 50-51 S4-SS 58 60-61 65-66 



fetal brain



GIBCO 



HFB001
Tissue Origin



RNA Source



Hyseq
Library Name



SEQ ID NOS: 



72 75 77 80 82 85 90-91 94 100- 
102 107 110 112-116 118-119 122- 
123 126 128 134 136-140 147-148 
153-155 157 161 165 169-172 175 
131 186 188-189 197-198 204-206 
208 210 215 222-223 22S-226 230 
235-238 240-241 247 253 256-258
260-262 267-269 276 279-281 284 
286 289 298 300-302 307 310 318 
321-323 325 330-331 339 341 346- 
345 352 354 356-359 362 364-365 
371-372 377 379-380 382 384 387 
390 400 408 414-416 419 424 431 
434-435 438 441-443 449 4Sl 453- 
455 457-463 470 472-473 475 477- 
478 482-483 486-488 490-491 493 
496 499-500 502-504 506-507 509- 
S12 516 519-520 522 525-526 529- 
530 537-540 543-544 546-547 566- 
567 S69-S70 572-582 585 588 590- 
591 593 595 599 601 504 606-609 
6il-6l2 614-620 622-524 630 632 
636 643 645-647 650-652 654 659 
661 665 667-668 670-672 676 678 
681 687 689 692-694 697 699 710 
714 717 721 727 729-732 734 736 
738 743-746 750-751 759 763 766 
770 772 775-777 784 789 791 796 
799 802-805 810-811 814 819-821 
824 826 830 834-837 839-850 854- 
8S6 858-860 862 864 869 871 876- 
877 879 883 886-867 890-891 893- 
89S 898-901 905 908-910 912-916 
919 922-923 925 927 930-933 935- 
938 948 952-960 963-964 967 969- 
972 975 978-979 981 983 986-987 
990 992 995 997 999-1002 1005- 
lOOS 1011-1013 1016 1018-1019 
1023 1026 1029-1031 1033-1035 
1038 1041 1047 lOSO 1053 1057 
1059 1064 1068 1070 1072-1073 
1078-1079 1081-1082 1036 1089 
1094 1097 1103 1107-1109 1113- 
1115 1121-1122 1127 1134-1135 
1138 1140 1143 1148-llSl 1153 
1156-1157 1159 1167 1170 1175 
1193-1194 1200 1202 1207-1209 
1211 1216 1219-1220 1226-1227 
1229 1232-1234 1240-1241 1243 
1246 1249-1251 1253-1254 1258
1267-1268 1271 1276 1279 1282 
1285-1289 1293-1294 1305 1307- 
1308 1312 1316 1320 1327 1338-
1339 1341-1344 1346 1349 13SS- 
1357 1359 1365-1366 1369-1370 
1373-1375 1379 1386 1389 1394 
1398 1409 1413-1414 1416-1417 
1420-1421 1425-1427 1430 1433 
1437 1439 1442 1445-1452 1454- 
1457 1459 1463-1464 1468 1470 
1474 1477-1479 1489 1492 1494 
1497-1498 1501-1503 1507 1509 
1511-1513 1517 1520-1521 1524- 
1526 1531-1533 1535 1S37-1S38 
1547 1554 1556-1559 1564-1567 
1571 1584 1587 1589 1594 1599- 
1601 1611-1612 1614-1616 1619- 
1620 1625-1628 1630-1631 1634 
1637-1638 1640-1643 1645 1648- 
Tissue Origin



RNA Source



Hyseq 
Library Name 



SEQ ID NOS: 



1649 
1664- 
1679 
1704- 
1720 
1737- 
17SS 
1779 



1651 

1665 

1683- 

1705 

.1724 

1738 

17S7 

1785 



1653-1655 
1667 1669 
1684 1686 
1709 1713- 
1727-1728 
1743-1744 
1760-1761 



1657- 

1673 

1693 

1714 

1731- 

17B2 

1765 



i6se 

1678- 

1701 

1717- 

1733 

17S4- 

1772 



macrophage 



Invitrogen 



HMP001



S-B 110 204-205 503 
87R 933 988-989 1379 



634 678 859 
1448 1504 



infant brain 



University 



10 12-13 15-18 22-23 
37-39 43 47 50-51 54 
65-66 60-69 72-74 80 
88-92 97 100 102-104 
112-113 115-115 118 
134-136 138-139 143 
152 154-155 163 165- 
175 181-184 186 193 
203-205 209-210 214 
226 231-232 235-236 
252 257 260 268-269 
279-281 286 288 291- 
300-301 304 307 310 
330-331 333-334 339 
352 3S6-3S7 362 371- 
380 383-384 392 397 
411 413-414 416 418- 
430-431 434-435 438 
454 461 464-466 469- 
475-476 478 482-483 
494 497 503 507-S08 
519-520 524-526 530 
547 5S0-S51 561 563 
572-576 579 581-582 
591 593 595-597 607- 
616-617 620 622-624 
641 645-647 650-655 
665 667-675 689 691 
703 707 713-715 717 
733-736 739 743 745 
763 769-770 772 778 
788-789 793-794 799 
814 825-826 830 834- 
845 848-850 854-855 
865 870 872 875-876 
890-891 894-895 898 
917 919 922-925 527- 
934-936 938 941 915 
953-954 959-962 966 
981 986-990 992 997 
1004-1006 1014 1016 
1024-1025 1033 1036 
1052 1054-1055 1057- 
1064 1068-1070 1073 
1085 1089 1108-1113 
1123-1124 1130 1132- 
1149 1151 1153-1154 
1172 1174-1175 I1S3- 
1190 1193-1194 1196- 
1204 1208-1209 1211 
1226-1227 1229 1231 
1247 1249 1251 12SS 
1262 1269 1274 1279 
1285 1287-1289 1294 
1307 .1313-1314 1316- 
1332 1341-1342 1345 
13€Z-13e3 1365-1366 
1374 1381 1383-1384 
1403 1406-1407 1413 



25 29 34 
56 58 60-63 
82-83 86
106-108 110 
123 128 130 
147-149 151- 
167 169 172- 
196 198 201 
215 222 224- 
239 246-247 
272 276-277 
292 295 298 
313 321-323 
345-347 349 
372 377 379- 
401 406 408 
419 422 428 
443 449 453- 
470 472-473 
487 490 492 
510-513 516
534 536-540 
564 566-567 
584-587 590-
609 611-613 
627 631 637 
657-658 660- 
695 697 699 
721 728-731 
751 755 759 
780-781 785 
803 808 811
836 840-843 
860 862 864- 
878 886 asd 
903-904 916- 
928 930-932
946 948-950 
969 977 979 
999-1000 
1018-1019 
1047 1051- 
1059 1063- 
1081-1082 
1118-1120 
1138 1140 
1163-1170 
1184 1188 
1197 1199 
1218-1222 
1234 1241 
1258 1261- 
1281 1283
1295 1305
1320 1329 
1349 1356 
1368-1370 
1388 1400 
1417 1420 
Tissue Origin



RNA Source 



infant brain 



Hyseq 
Library Name



Columbia
University 



SEQ ID NOS:



IB2003 



infant brain



infant brain 



1423 142y^l43i 143S-1436 I43"9^ 
1441 1443 1447-1449 1451-1452 
1454-1455 1457 1459 1463-1465
1468 1470-1471 1475 1479 1482- 
1483 1485 1493-1494 1496 1498-
1499 1502-1503 1505-1507 1509
1522-1523 1525 1528 1531-1533
1542 1546-1547 1549-1550 1554-
1555 1563 1565-1567 1569 1575
1580 1583-1586 1588 1590 1592- 
1593 1595 1598 1600-1601 1608-
1610 1612 1614-1616 1619 1621
1624 1626-1627 1630-1633 1637 
1639-1640 1642 1644 1647 1652 
1654-1655 1658-1659 1664-1665
1672-1673 1676-1681 1685-1688 
1693-1695 1701-1702
1717-1720 1723-1724 
1733 1735-1741 1743-1744 1752 
1755-1758 1762 1765 1771 1774
1777-1778 1786 



1704 1708 
1726-1728



Columbia 
University 



Columbia 
University 



IBM002 



17-18 20-23 29 34 43 50 68-69

78-80 88 100-101 107 110-112 118
123 128 133 135-137 146 148 152 
159 166 169 174 194 198 203 215 
223 225-226 229 235-236 247 260
276-281 286 290-292 295- 3Q0-3QX 
310 322 324 331 334 339 346-347 
349-350 352 357 371 376-377 382 
384 403 408-409 414-415 453-455 
472 476 478-479 490 503 507 516 
520 530 534 536-540 551 563 572-
576 585 587 590-591 593 595-596 
601 606 612 616-617 620 622-624
650 652-653 661 665 670-671 674-
675 678 609 715 717 727-728 730
734 759 775-777 780-781 785 796 
806-807 811 824 845-846 864 869 
875 882 889 894-895 898 904 917 
919 921-923 932 935-936 946 950
954 962 977 979 997 999-1000 
1005-1006 1009 1011 1017 1024
1033 1037 1043 1055 1057 1109 
1114-1115 1120 1123 1127 1144-
1145 1149 1151-1153 1160 1167
1170 1174 1193-1194 1196 1199 
1202 1206 1209 1220-1221 1226 
1229 1240-1241 1251 1258 1284 
1288-1289 1305 1314 1327 1333
1344 1347 1350 1355-1357 1365-
1366 1378-1379 1388 1400 1403 
1421 1423 1431 1436 1440-1441 
1446-1447 1457 1459 1471 1493 
1503 1507 1509 1536 1546 1557-
1559 1567 1572 1587 1595 1598 
1610-1612 1615 1631 1639 1644 
1647 1657-1658 1673 1678-1681 
1G82-±6Q^ 1701-1702 1706-1709 
1713-1714 1719 1757 1760-1761 
1765 1771 1778 



IBS001



101 lii 139 152 260 279 290-292 
374 377 551 563 608-609 653 659
814 954 1005-1006 1029-1030 1130 
1164 1209 1258 1294 1305 1320
1327 1397 1431 1498 1507 1615 
1640 1694-1695 1763-1764 1767 
1779 



xu i'2 119 175 279-281 321 

371 446 551 563 623 652 667 669



131 
Tissue Origin



RNA Source 



Hyseq 
Library Name 



SEQ ID NOS: 



671-672 819 949 966 1113 1130

1151 1188 1193-1194 1196 1229

1258 1265 1271 1287 1317-1319

1324-1325 1342 1423 1440-1441

1448 1471 1482 1525 1532 1546 

1562 1569 1588 1591 1610 1618 

1647 1649 1658



lung
fibroblast 



Stratagene



LFB001



lung tumor 



Invitrogen 



9 17 20-21 25 
153 157 197-198 
213 223 262 266 
333 356 370 427 
472 493 498 503 
537-540 542-544 
599-600 607 615 
692-694 712 719 
794-796 810 837 
856 869 876 903 
964 975-976 984 
1024-1025 1033 
1070 1072 1082 
1136-1138 1140 
1233 1246 1279 
1320 1334-1335 
1446 1478 1482 
1555 1567
1625 1632 
1655 1662 1680
1690 1696 1702 
1760-1761 1778 



1552 
1620 



68-69 82 94 105
203 207-208 212-
233 302 321 326 
430 436 446 462 
516 S19 627 S3S 
562 565 567 586 
630 647 662-664 
745 748 775-777 
843-847 849 854- 
934 953 955-956 
1000 1005-1007 
1039 1053 1064
1112-1113 1134 
1195 1223 1232- 
1285 1295 1311 
1343 1427-1428
1493 1504 1537 
1575 1582 1598 
1638 1645 1654- 
1681 1684 1686 
1711 1733 1741 
1785 



5-10 18 20-21 29 33-36 40 43 52
54-55 61 65-66 68-70 73-75 80 85
88-89 93-94 100 103 106-108 112- 
113 115-116 118-119 123-124 126 
130-132 135-137 139-141 143-144
147-148 151-153 1S5-15S 159 161 
164 169 171 179-180 185 190 192
194 196-199 203-208 210 212-214 
216-217 219 222 233 240-241 244 
246 251-252 255-256 261-262 266
272 276-277 279-281 284 286 288 
290 295 298 301-302 309-312 317 
321 329 332 341-342 344-345 348 
352 358-360 363 368 370-371 376 
380-381 384 389-390 398 400 409 
414 423 426-427 430 432-436 443- 
444 450-451 454 462 468 472-477 
480-483 487-488 490-491 493 496- 
498 500 503-506 509-512 515-516 
519 521-523 526 530 534 541 544 
547 554 557 564 566-567 572-576
585-586 588-589 595-596 601 607 
611-612 615 619 621 623 626 630 
632-633 644 647 649 651 655-656
660 662-665 667 669 672 683-684
696 700 706 710 713 716 718-719 
722-723 728 734-739 743 750 752 
763 765-766 773-778 784-785 787-
789 791 800 802-803 809-812 814
824 826 828-829 832 838-839 841- 
845 849-850 852-855 857-861 864
866 874 878-880 882 887 890-891 
897-898 902 904 906-907 910 916
918-920 922 924-925 927 930-932 
934-935 937 947 950 953 955-956
961 963 966-967 969 971 977-979 
981 984 986-987 990 992-993 995 
997 999-1001 1005-1007 1009
1012-1013 1018 1020 1022-1024 
1026 1029-1030 1033 1038 1041 
RNA Source


Hyseq 


SEQ 


ID NOS: 






Library Name 














1045 1047-1050 1052 1054-1055
1059 1063-1064 1067-1071 1073-
1074 1076 1085 1087 1089 1095-
1097 1104 1106-1107 1109 1112
1116-1117 1119 1126 1134-1135
1139 1141-1142 1144-1145 1148
1152-1153 1156-1158 1167 1170
1172 1178 1195-1196 1198-1200
1202 1204 1208 1214 1216 1219
1222 1227 1234 1241 1247 1252
1257-1258 1265 1267-1270 1276
1278 1280-1281 1283 1285 1288-
1289 1295 1300 1305 1308 1312
1317-1321 1329 1338-1339 1341
1344-1346 1349-1351 [illegible]
1357 1365-1366 1369 1378-1379
1383-1385 1394 1397 [illegible]
1403 1408 1417 1419
1431 1433-1436 1438 [illegible]
1448 1454-1455 1460 [illegible]
1470 1474 1480-1481
1488 1490-1491 1494-[illegible]
1508-1509 1511-1512 1515-1516
1519 1523-1524 [illegible]
1540 1546 1549-1550 [illegible]
1561 1565 1567 1569 [illegible]
1591 1593-1594 1596-[illegible]
1602 1608 1614-1616 [illegible]
1624-1625 1627-1632 [illegible]
1644-1645 1647-1649 [illegible]
1656-1662 1664 1666-[illegible]
1671 1673-1675 1678-
1685-1688 1690-1692 1696-[illegible]
1705 1709 1716-1717 1722 1777
1730 1735 1739 1741 1743-1744
1748-1749 1753 1760-1762 1765
1767 1770-1771 1773 1775-1776
1778-1779 1786






lymphocytes




LPC001


4 11-12 18 24-25 30-31 48 50-51
56-57 68-69 80 92 98 103 105 110
126 137 152-153 157 165 172 188-
189 197 203 210 217-218 222-223
225-226 229 231 247 251 256 264
272 280-281 284 300-301 321 325-
326 339 348 352 357 371 382 384
390 400 404 412 414 421 423 426-
427 430-431 445 447-448 451 454-
455 475 503 516 526-527 530 537-
540 549 556-560 563 574 577 589
602 613 615-617 621 623 628-630
636-637 647 649 657-659 690 697
717 723 755 764 775-777 780 786
789-790 793 800 802 822 838 849
866 869 876 881-883 892 898 906-
907 911 921-923 928 975 990 992
996 1001 1004-1007 1033 1050
1054 1078 1107 1135 1140-1141
1143 1148 1158 1163 1177 1199
1205 1216 1226 1231 1236 1241
1244 1250 1258 1260 1265 1269-
1271 1290-1293 1308 1312 1317
1319-1320 1339 1345-1346 1348
1350-1351 1357 1367 1369 1379
1381 1383-1384 1386-1387 1389
1394 1397 1405 1423 1425-1428
1431 1437 1446 1448 1461 1466
1470 1472 1474 1482 1492 1506
1528 1537 1546 1549 1591 1598
1600 1603-1604 1606 1627 1636
Tissue Origin



leukocyte 



RNA Source 



Hyseq 
Library Name 



LUC001



SEQ ID NOS: 



1638 1647-1649 1651 1658-1659

1664 1676-1677 1680-1681 1687-

1688 1699 1711 1715-1716 1726

1728 1737 1740 1746 1748 1752 

1756 1758 1777 1775 



3-4 10-11 13 15-18 20-21 24-25
30-31 35-36 40 43-45 48 50-51
54-58 60-63 68-69 75 79-80 82-83
85 88-91 93-96 98 100 103-104
107-108 112 116 119 123 125-128 
134-140 142 147-149 151 153 155
157 162-163 167 169-172 174 177- 
179 186 190 192-199 203-207 210 
212-215 217-219 222-223 229 235-
236 247 251 255-258 260 262 272
274-277 280-281 285-286 297-301
307-310 313-314 316-317 321 325- 
330 333-334 340-342 348-349 352
354-358 370-371 380-385 387-388 
400 405 408-410 412 414-416 421- 
425 430-431 434-435 437 439 441- 
442 445-451 453-454 456 459 461- 
464 468-472 474-475 481 483-485
487-491 496 499-501 503-504 509-
513 516-519 522 526-527 529-531
534 536-540 542 547-549 553-559
566-567 571 574-577 579 582 584- 
586 589 593 595-597 601-602 604 
606-607 611-613 615-621 623 627- 
629 633 636-637 642 644-650 655
659-660 662-665 667 669 674-675
678 682-684 692-696 698 700 706 
708 710 716-720 725-726 729-736 
738-739 743-746 749 751 753 756 
759 765-766 768 770-773 780 784- 
786 788-790 793 796 798 800 802-
803 810-811 814 817 819 826 828- 
830 832 834-836 838 843 845-860 
863-864 866-871 877-879 881-892
894-896 898 902 904-914 916 919- 
925 927 930-932 935-936 941-942
945 948-949 953 955-956 958 960- 
962 964 967 970-971 973 975 977 
985-990 992-993 995-996 999-1002 
1004-1009 1011 1014 1017-1019 
1022-1023 1025 1027 1029-1031 
1033-1036 1038 1041 1043 1047 
1050 1053-1054 1058-1059 1061- 
1062 1064 1068 1070 1072 1078 
1085-1086 1089-1091 1093 1097 
1106-1107 1110-1113 1115-1117 
1122-1123 1125 1129 1132-1133
1135-1137 1140-1145 1152 1158 
1163 1168 1170-1174 1176-1178 
1180 1182-1183 1186 1195 1198- 
1200 1202 1205-1206 1211 1216 
1219-1221 1223-1227 1230-1236 
1238-1242 1247 1252 1254 1256 
1258 1261-1262 1264-1265 1269- 
1270 1272-1275 1277 1280-1284 
1287-1293 1299-1300 1306 1308
1312-1313 1317-1320 1322 1324- 
1330 1333-1335 1339 1341 1343-
1347 1349 1353-1357 1359-1361 
1365-1367 1369-1370 1373-1374 
1377 1379-1381 1386-1387 1394 
1400 1403 1409 1419 1423 1425-
1428 1430-1431 1433-1434 1437-
1438 1440-1442 1446-1448 1450 
Tissue Origin



leukocyte 



melanoma from
cell line ATCC
#CRL 1424



RNA Source 



Clontech



Clontech 



Hyseq 
Library Name 



LUC003 



SEQ ID NOS: 



1453 

1470- 

1488

1506 

1521- 

1531 

1549 

1565

1594 

1608 

1626 

1639 

1653 

1670 

1692 

1711 

1727 

1744 

1762 

1784 



1458- 

■1471 
1490- 
1509 

■1522 
1534 

■ 1550 
1567 
1596 
1611 

•1629 
1641 

-1655 
1675- 
1696 
1716- 
1733 
1748- 
1765 
1786 



1459 

1474 

1493 

1512-^ 

1524- 

1538 

1553 

1575 

1596 

1614 

1631- 

1644- 

1658- 

1679 

1700 

1717 

1737- 

1749 

1769 



1463- 
1477- 
1496- 
1513 
1525 
1541 
1555- 
1580 
1600- 
1620- 
1632 
1645 
1660 
1684' 
1702 
1720 
1738 
1752 
1771 



1464 
1478 
1501 
1516 
1527- 
1545- 
■1556 
1589 
■1602 
■1621 
1636 
1648- 
1662 
•1688 
1707- 
1723 
1741 
1755 
-1772 



1468 

1482- 

1504 

1519 

1528 

1547 

1560 

1591 

1606- 

1624 

1638- 

1650 

1669- 

1690- 

1709 

1725- 

1743- 

1760- 

1781- 



T 35-36 44-45 61 68-69 75 82 102 
119 139 154 179 197 244 280-281 
324 372 404 430-431 455 461 476- 
477 481 503 537-540 554 575-576 
581 589 608-609 621-622 624 630 
632 647 662-664 669 679 698 764 
775-777 802 848 851 856-857
905-907 915 949 952 990 992 
1002 1113 1119 1170 1183 1216 
1236-1237 1241 1275 1346 1353 
1357 1359 1377 1506 1515 1534 
1553 1591 1600 1613-1614 1621
1628 1670 1676-1677 1691-1692 
1699 1733 1738 1772 



773 
879 



25 35-36 43 80 104 126 128 150 
163 166 188-189 197 210 215 220 
271 277 280-281 310 317 336-338
345 351 372 380-381 383 387 412 
415-416 430 445 448 454 456 467 
461 490 499 503 526 528 546 548 
567 575-576 588 601 613 615 647 
660 665 734-735 737 759 778 787
790 800 832 845 856 859 869 878
883 887 905 914 932 934 958 976
985 990 992 999-1000 1025 1031 
1038 1050 1055 1068 1074 1088
1099-1102 1107 1136-1138 1149 
1156 1163 1172 1190 1195 1200 
1214-1215 1217 1226-1227 1235 
1238-1239 1244 1253 1278-1280
1293 1311 1320 1330 1334-1335 
1345 1355 1367 1386-1387 1394 
1403 1406 1414 1423 1437 1442 
1465 1521 1529 1536 1539 1541 
1547-1548 1582 1620 1626 1631 
1638 1647 1653 1660 1667 1669- 
1670 1680-1681 1696 1704 1715 
1724-1725 1731-1732 1750 1760-
1761 



mammary gland



Invitrogen



MMG001



5-8 10 12 14
33-39 42-43 
71 73-74 79 
106 108 112 
146 148 150
166 170-172 
188-190 194 
222 224 227 
251 253-254 
271 276-277 



18 20-21 24-25 29 
52 55-58 60-64 68-69
80 82 89 98 100 103 
123 128 133-137 144- 
152 154 158-159 165- 
174 176 178 181-185 
198 201-206 210 217- 
228 231 233-237 247 
256 261-263 266-267
279-281 284-286 288
Tissue Origin


RNA Source 


Hyseq 
Library Name 


SEQ ID NOS: 








290 297 299 301 304 309-312 318
320-321 323-325 327-329 331-332 
334 339 341 344-345 348 350 356 
359-360 362-363 368 371 376 379- 
303 380 390 393-395 397-398 405
408 412 414-415 423 430 434-437 
441-444 448 451-455 462-464 474 
476 479 482 485-486 488 490 494- 
495 498 503 506 509-512 516-517
519-520 522 527 529 534 537-541
547 549 554 557 562 572-574 587
589-591 597 602 607 618 623 628- 
629 632 634-640 644 647-648 650- 
652 655 657-658 660 665 667 669-
672 674-676 679 682 688 695-696 
706-707 710 713 717 720 722-730 
732-734 736 738 743 747-748 750 
755 759 761 766 770 780 784 786-
789 794 803 806-807 809 814 817-
822 827-829 837 842 854-858 863-
864 866 869-870 872 878 881 889
893-900 904 906-907 911 916 919 
921-923 926 935-937 946 948-949 
953-954 957 960-961 963 965-966
970 977-978 984-989 993-997 
1000-1001 1005-1006 1008 1013-
1014 1016-1017 1023 1025 1027 
1032-1033 1036 1039 1043 1045 
1055 1057-1058 1063 1068-1075
1077-1078 1085 1087 1089-1091 
109S-11Q2 1107-1108 1112-1119 
1121-1123 1131-1133 1136-1137
1139-1142 1144-1145 1148-1149 
1153 1159 1167 1170 1172-1173
1183-1185 1190-1192 1196-1199 
1207-1208 1212 1216-1218 1222- 
1223 1225 1231 1234 1240-1241 
1247 1253-1254 1258-1259 1261- 
1262 1270-12S0 1283 1285-1286 
1298 1307 1314 1316-1320 1323- 
1325 1330 1334-1335 1342-1345
1349-1352 1354-1355 1359 1369-
1370 1377 1379 1381 1383-1384 
1389 1405 1414 1419 1421-1423 
1425-1426 1428-1429 1431 1434- 
1437 1439 1448-1449 1454 1457 
1460-1464 1466 1471 1480-1483
1487 1489-1491 1493 1505 1507
1512 1519 1526-1528 1532 1534
1536 1539 1542 1547 1549-1550
1554 1561-1562 1564 1567 1572 
1576-1579 1581-1582 1587-1588
1592 1594 1596-1597 1601-1602 
1607-1608 1610 1612-1616 1618 
1621-1622 1625-1626 1631 1635-
1636 lb4X J.fa*i J — AO** iot / 
1652 1654-1655 1657-1658 1660 
1662 1664-1666 1669-1671 1673-
1674 1676-1677 1680-1685 1689- 
1692 1701 1706 1713-1715 1719- 
1720 1723-1728 1730-1732 1738 
1740 1742-1744 1746-1747 1749 
1751 1753 1760-1762 1765-1768 
1771 1774 1776-1777 1779 1783-
1784 1786 


induced neuron
cells 


Stratagene


NTD001


29 35-36 80 116 123 156 163 181 
214 230 280-281 284-285 307 321 
330 340 358 371 375 377 380 382
422 424 492 497 532-533 542 546 
Tissue Origin
RNA Source



retinoic acid

induced 
neuronal cells 



Hyseq 
Library Name 



Stratagene



neuronal cells 



pituitary
gland 



Stratagene



m-uooi 



Clontech 



placenta 



Clontech 



PIT004 



PLA003 



SEQ ID NOS: 



549 566 586 595 612 645-647 654
734 775-778 780 792 799 831 826
856 858 875 936 953 985 990 992
1041-1043 1055 1072 1104 1193-
1194 1206 1223 1246 1253 1274
1288-1289 1291 1294 1311 1320
1343 1359 1412 1423 1485 1520
1623 1645 1684 1705 1715 1751

5-8 78 268-269 277 383 431 506
623 677 731 999-1000 1199 1425-
1426 1547



29 65-66 80 82 110 119 146 152
166 174 181-185 198 227-228 253
284 309 325 332 334 336-338 375
391 393 406 414-416 454 465-466 
470 488 503 506 510-512 519 537-
540 572-574 597 602 607 623 647 
661 700 702 716 743 771 792 858 
904 948 954 977 1000 1005-1006 
1025 1064 1068 1122 1148 1185
1219 1226 1234 1246 1271 1283 
1295-1296 1311 1317-1320 1329- 
1330 1350 1355 1365-1366 1378
1383-1384 1400 1412 1445 1505 
1539 1547 1578 1647 1656 1683 
1690 1738 1749 1783-1784 



311 314 379 408 419 430 454 1055
1095-1096 1272-1273 1312 1320 
1378 1652 1671 1720 1725 1736 
1741 1755 



5-8 124 208 277 370 843 906-907
1280 1317-1319 1359 1609 1621 
737 



prostate 



Clontech



PRT001



Invitrogen 



REC001



9 46 57 71 107 147 171 177 197
201 229 231 242-243 274 280-281 
307 310 317 330 358 373 382-383 
400 430 434-436 461-462 469 477 
489 497 500 505-506 513 521 526
531-533 547 618 649 657-658 662-
664 710 729 767 771 789 820 861 
871 874 890-891 905 938 945 963- 
964 988-989 1002 1025 1033 1045 
1061 1095-1096 1112 1125 1142 
1196 1198 1202 1232-1233 1241 
1258 1272-1273 1287 1295 1313 
1333 1341 1344 1349 1360 1362- 
1363 1367 1437 1442 1447 1475
1478-1479 1482 1469 1513 1517 
1527 1531 1536 1598-1599 1628
1636 1657 1680-1681 1687-1688 
1717 1738 1743-1744 



17-18 29 33 62-63 71 73-74 83 86 
113 126 146 153 158 167-169 195 
200 206 261 309 312 341 344 368 
373 388 395 408 414 420 430 441- 
442 446 448 464 468 483 517 537- 
540 547 567 585 589 602 623 628-
629 632 645-647 651 657-658 669
717-719 721 725-726 730 748 750 
756 762-763 766 770 774 790 819 
825 843 849 851 881 903 909 948-
949 960 986 996 1020 1023 1033-
1034 1064 1067 1070 1075 1086 
1108-1105 1113 1130 1139 1153 
1159 1172 1178 1185 1187-1189 
1205 1220 1225 1240 1244 1271 
1317-1320 1323 1334-1335 1350- 
1351 1355 1369 1373 1375 1425- 
Tissue Origin


RNA Source 


Hyseq 


SEQ ID NOS: 






Library Name 










1426 1436 1439 1469 1474 147-? 








1482 1546 1587-1588 1592 1596








1610 1622 1627 1644 165& 1662 








1665-1666 1669 1675-1677 1749 








1786 


salivary gland 


Clontech 


SAL001


10 55 97 103 110 140 149 152 158 






198 217-218 242-243 256 301 308








312 321 333 351 354 360 410 437 








448 473 487 494 496 501 535 555








569-570 572-573 590-591 624 636 








651 759 762 764 768 771 788 800 








809 826 848 865 879 906-907 925








933 963 1016 1020 1025 1040 1046 








1055 1066 1103 1150 1172 1181








1234 1281-1282 1288-1289 1298 








1315 1320 1333 1336-1337 1346 








1359 1373 1379 1424 1447 1449








1474 1482 1492 1494 1498 1511 








1523-1524 1537 1554 1596 1626- 








1627 1636 1652-1655 1658 1665








1671-1672 1691-1692 


salivary gland


Clontech


SALS 03 


158 326 1423 1463-1464 




ATCC 


SFB001


1320 1400 












ATCC 


SFB002 


262 736 1025 1253 










skin


ATCC 


SFB003


709 1119 1350 1631 1653


fibroblast












SIN001


25 142 146-147 151 155 198 203 








244 260 271 280-281 285 288 298








301-302 308 312 334 340 371 398 








408 412 414 416 423 425-427 430 








434-435 445 452 454 478 503 515








519 521 523 543 547 549 555 559








563 569-570 585 592 604 611 626








628-629 632 650 659 681 710 714 








719 750 764 780 798 829 842 857








859 866 83V o92 894-895 901 904 








906-907 912 919 935 997-998 1000 








1007-1008 1026-1028 1044 1055








1089 1097 1116-1117 1131 1148 








1169 1199 1219 1234 1247 1264 








1279 1316 1320 1326 1341 1343 








1349 1351 1374 1387 1398 1400 








1403 1407 1423 1428 1468 1498 








1501 1521 1550 1556 1585 1597








1636 1638-1639 1645 1653 1656








1662 1671 1675 1684 1691-1692








1704 1711 1717 1719 1722 1725- 








1726 1729 1733-1734 1743-1744 








1762 1767 1/BO x/ai> 


skeletal 


Clontech 


SKM001


18 20-21 82 84 101 118 134 148 


muscle 






^ c: n ^ a T r- OOC—OOC otic 0*7A OTT 








289 329 361 412 414 424 440 452 








459 470 488 503-504 537-540 647 








660 673-675 715 773 780 786 830








905 922 950 963 982 990 992 1020 








1047 1063 1115-1117 1121 1134 








1228 1268 1284 1298 1321 1329 








1336-1337 1343 1409 1413-1414 








1509 1599 1624 1644 1653 1712 


skeletal 


Clontech 


SKM002 


168 1683 1712 


muscle 








skeletal 


Clontech 


SKMs03 


235-236 1409 


muscle 








skeletal 


Clontech 


SKMS04 


235-236 


muscle








spinal cord 


Clontech 


" SPCOOl 


4 9 11 17 30-31 35-36 43 46 60 
Tissue Origin



RNA Source 



Hyseq 
Library Name 



SEQ ID NOS: 



adult spleen 



82 as 32 94 108 1X0 
167 198 204-205 210 
259 277 280-281 300-
317 372 379 387 392 
430 433 448 467 473 
509 513 519 524 526
547 549 551 559 567
607 616-617 623 625
652 657-658 670-671 
682 709 711 715 719 
749-750 753 775-777
809 820 832 834-836 
855 858 861 864 871-
898 906-908 917 919 
944 970 985 990 992-
1039 1053 1059 1065 
1077 1082 1085 1097 
1116-1117 1128 1134 
1174 1192-1194 1215
1243 1283 1294 1307 
1323 1327 1330 1350 
1356 1359 1368 1375 
1407 1423 1429 1437 
1454 1470 1482 1492
1511 1529 1538 1548-
1571 1578 1598 1600 
1627 1630 1639 1646 
1670 1686 1696 1740 
1771 



116 139 157 
215 229 256 
302 304 315 
419 426-427 
487 489 506 
537-540 543 
569-570 5S3 
637 649-650 
673 679 681- 
728-729 734 
781 789 791
847-849 854- 
872 875 884 
924 934 942 
993 998 1013 
1072 1075 
1103 1109 
1151 1170
1225 1241 
1312 1320
1353-1354 
1400 1406- 
1443 1448 
1501 1508 
1549 1565 
1614 1625 
1651-1652 
1751 1755 



Clontech 



stomach 



117 312 326 348 424 426-427 431 
845 866 1320 1330 1333 1344 
1355-1357 1371 1387 1397 1446
1538 1579 1669 1686 1739 1767 



Clontech 



STO001



thalamus 



10 15-15 61 68-69 100 IIV 149 
197 201 227-228 231 249 273 280- 
281 287 291-292 302 312 358 362 
426-427 430 446 462 475 479 535
5^7 620 630 6S1 662-664 722 739 
780 782 785 846 919 960 964 966-
967 976 1008 1012 1032 1042 1063 
1071 1135 1170 1208 1234-1235 
1259 1277 1280-1281 1322 1349 
1359 1369 1449 1468 1474 1478
1487 1493 1498 1557-1559 1622
1634 1651 1653 1729 



Clontech THA002



9 11 25 85 87 112 137 146 180 
190 198 206 210 212-213 235-236 
239 261 268-269 279 290 301 325 
333-334 341 351 356 364-365 379 
388 393 396 419-420 441-442 458
477 483 SOa 525 531 549 567 606 
608-609 647 681 715 725-727 736 
774 782 784 794 827 883 890-891 
899-900 961 997 999-1001 1004
1034 1055 1097 1129 1144-1145 
1150-1151 1157 1172-1173 1177 
1193-1194 1208 1220 1245 1280 
1305 1345 1355 1369 1434-1435 
1440-1441 1454 1496 1546 1549
1562 1572 1578 1590 1594 1613- 
1614 1640 1651-1652 1671 1687- 
1688 1703 1743-1744 1746-1747
1753 



thymus 



Clontech 



THM001



44-45 54 57-58 62-64 79 104 123 

126 134 153 193 212-213 218 242- 

243 258 274 277 279 297 301 307 

327 330 333 342 351 358 371 410 

430 445 465-466 468 471 483 487 

493 503 506 509 517 526 535 537- 
Tissue Origin


RNA Source






SEQ 


ID NOS: 










540 546 548 554 56? 584 586 590-
591 604 612 621 636-640 645-647
649 656 660 665 670 698 710 720
728 735 739 746 755 762 766-767
775-777 780 784-785 800 802 809
824 826 828 845 851 858-859 864
866 870-871 878 884 887 892 899-
900 927 930-931 967 983 986 990
992 999 1014 1029-1030 1033 1059
1066 1073 1103 1107 1113 1116-
1117 1119 1140-1142 1158 1163
1172 1177 1195 1206 1209 1213
1216 1216-1219 1221-1222 1227
1271 1277 1282 1320 1329 1349
1367 1369 1383-1384 1417 1419
1423 1425-1427 1448 1477 1488
1493 1536 1554 1620 1644 1646
1649 1654-1655 1661-1662 1669-
1670 1674 1676-1677 1685-1688
1707 1711 1731-1732 1737




Clontech


THMc02


5-9 15-21 25 33 35-36 43-45 48
50-51 54-55 60 75 83 87 89 93
98-100 102 105 112 117 135-137
141 143 146 157 167 169 192 196
211 217-219 222 224 229 233 235-
236 240-241 244 251-252 256 261-
262 268-269 286 288 290 295 297
301-302 309-310 315-317 321 324
327 334 342 350 352-353 360 370-
373 382 384 400 403 410 414-416
424 430-431 436 445 454-456 461
464-467 470 472 474-476 483 486
497 500 504 506 513 516 519-520
524 526 530-531 534 537-540 549
554-555 565-566 569-570 572-573
575-577 586-587 595 603-604 606
612 630-632 634 636 647 650 657-
660 666-667 669 673-675 678 698
700 703 708 720 725-726 731 738-
739 743-744 750-753 757 759 763-
765 767 772-779 787 789-790 798
800 810 823 829 834-836 841 848
854-856 859 861 864 870-871 881
890-891 898 908-909 913 928 933
941 949 958 961 963 967 969 975
981 986 988-990 992 999 1007-
1008 1014 1016 1039 1041 1073-
1074 1079 1089 1097 1109 1114-
1117 1122 1131 1140-1141 1144-
1145 1163 1172 1175-1177 1186
1196 1198 1206 1211 1216 1220
1223 1227 1234-1243 1261-1262
1267 1271 1280-1281 1284 1290
1308 1317-1320 1322 1324-1325
1327 1330 1334-1335 1339 1346
1350-1351 1355 1357 1360 1370
1374 1377-1379 1386 1389-1390
1392 1397 1400 1402 1406-1407
1417 1423 1425-1427 1440-1441
1466 1474 1477 1483 1493 1498
1504 1506 1525 1536 1545 1549
1566 1594 1598-1600 1608 1611
1614 1621 1623 1625 1632 1639
1641 1644 1647 1649 1653-1656
1658 1662-1663 1671 1673 1678-
1681 1686-1688 1693 1705 1707
1711 1717-1718 1726-1727 1731-
1733 1737-1738 1743-1745 1758-
1761 1771-1772 1779 1786
Tissue Origin
thyroid gland



RNA Source 



Clontech 



trachea 



Hyseq 
Library Name



THR001



Clontech 



TRC001



SEQ ID NOS: 



280-281 
302 309- 
341-342 
363 368 
398 400- 
430-431 
454-455 
484-485 
500-501



4 9-10 20-21 37-39 48 50-51 54-
57 60-61 65-66 71 83 94-96 98-
100 102 104 110 112 115-117 119
123 127 133 136-137 140 149 152- 
153 155 158 163-164 168-169 171
186 190-192 197 201-203 219-220 
229 233-237 246-247 253 256 258 
262 265-266 268-269 277 
284-286 288-289 298-299 
311 317 321 326 332 335 
344 348 350 354 358-359 
371-373 382-383 385 394
401 411 414-415 421 424
433-436 443-446 450-452 
458 472-474 476-478 482 
487-488 490-494 496-497 
503-504 506 509-513 516-517 519
524 526-527 529 535-540 547 549 
562 564 569-570 575-576 588 594-
595 601-602 604 606 610 612 615-
617 619-623 628-630 634-635 642 
647 649-651 660 662-665 668 670
681 690-694 696 698 700 709 721 
727-729 732 734 738 740-741 743 
745 750 759 761 763 765 770 773
780 785 795-796 798 802 804 823- 
824 826 828 833 838 841-845 847
849 857-860 867 874-875 878 880-
881 887-888 890-892 894-895 898
908 910-911 913-914 922-923 926-
927 929 932-934 937 939 941-942
948 953 957 961 963-964 966 978- 
979 981-982 987 990 992 1001
1004-1006 lOia 1014 1020 1024 
1033 1038-1039 1044 1047 1050 
1052-1054 1055 1058 1068 1070- 
1071 1077-1079 1088 1094-1097 
1105-1106 1112-1113 1116-1117 
1124 1126 1128-1129 1131 1134 
1136-1137 1142-1143 1146-1147 
1149-1150 1156 1161-1164 1167 
1170-1173 1177-1181 1190 1192 



1197 1200 1204 
1217 1219 1222 
1235 1241 1245
1258 1260 1262 
1286-1289 1299 
1330-1332 1334- 



1206-1209 1214 
1230 1232-1233 
1247 1254 1257
1271-1273 
1306 1314 
1335 1342 



1283 
1320 
1345 
1374 



1349 1365-1367 1370-1372 
1381 1394 1407 1419 1428 1436-
1437 1440-1441 1443 1446-1449 
1454 1459 1461-1462 1468 1470- 
1471 1475 1477 1479 1482 1491 
1497-1498 1504-1505 1507 1513 
1522 1524-1526 1528 1531 1534
1536-1537 1548 1550 1553 1555-
1559 1562 1567 1578 1590-1591
1597 1599-1601 1612 1614 1616 
1619-1620 1622 1624-1626 1628 
1631-1632 1634 1636 1639 1644-
1645 1648 1651 1653-1656 1658 
1660 1662-1663 1667 1669 1671 
1675 1678-1681 1683-1686 1689 
1691-1692 1703 1709-1711 1717 
1724-1726 1729 1734 1737-1738
1740 1743-1744 1749 1753 1759- 
1761 1770 1777 1766 



9 29-31 46 48 87 104 107 110 135 
1S8 222 262 266 2S6 301 318 331 
Tissue Origin


RNA Source


Hyseq
Library Name 






SEQ 


ID NOS: 










352 372 377 384 414 424 445-446
454 473 474 491 496 560 579 588
593 597 607 612 626 681 702 719
810 859 866 878 894-895 912 916
922 932 935 1046 1075 1080 1099-
1102 1113 1208 1215 1232-1233
1237 1281 1312 1385 1387 1405
1414 1424 1430 1437 1447 1505
1569 1579 1586 1600 1641 1653
1667 1671 1676-1677 1683 1691-
1692 1711 1717 1726 1772




uterus 


Clontech 


UTR001


[illegible] 9 25 41 46 57-58 61 89 104
108 139 152 174 196 200-201 206
263-265 274 290 387 408 420 438
44? 448 452 473 491 493 499 503
506 513 519 522 526 530 542-543
560 601 610 632 659 665 720 751
773 780 833 845 857 872 877 912
929 934 937 996 1009-1011 1018
1050 1075 1107 1124 1170 1219
1259 1279 1287 1310 1320 1323
1343-1344 1375 1437 1451-1452
1478 1481 1498 1519 1521 1536
1552 1579 1597 1602 1606 1620
1626-1627 1649 1652 1661 1670
1719 1722-1723
TABLE 2



PCT/US00/34263



SEQ 
ID 
NO: 


ACCESSION
NUMBER


SPECIES 


DESCRIPTION 


SMITH-
WATERMAN
SCORE


%

IDENTITY


1 


Y41736


Homo 
sapiens 


Human PR6ii14 protein 
sequence . 


1398 


100 


2 


Y66656 


Homo 
sapiens 


Membrane-bound protein
PRO943.


2389 


y9 


3 


AF113136 


Homo sapiens


receptor-associated-
kinase-M; IRAK-M


3043 


100 


4 


AP017S06 


Mus musculus 


Zn-15 transcription factor


63S1 


77 


5 


X02761 


Homo sapiens 


fibronectin precursor


1053S 


'^98 — : 




X02761 


Homo sapiens 


fibronectin precursor 


8990 


B9 * 


B 


X02761 


Homo sapiens


fibronectin precursor 


12S64 




9 


AJ011679 


Homo sapiens 


Rab6 GTPase activating
protein, GAPCenA 


5251 


99 


10 


W88 501 


Homo sapiens 


Human stomach carcinoma clone
HP104 IS -encoded protein. 


2381 


100 


11 


AF117754 


Homo sapiens 


associated protein complex 
component TRAP240 


11336 


98 


12 


Z97630 


Homo sapiens 


similar to ANK3 (ankyrin 3,
node of Ranvier (ankyrin
G))


896 


100 


13 


Y5e620 


Homo sapiens


Protein regulating gene
expression PRGE-13.


1894 


98 


14 


AF2134S7 


Homo 
sapiens 


triggering receptor expressed
on myeloid cells 2 


1238 


100 


16


AF2334S3 


Homo sapiens


w I- -L 1 J X. i\ /vVrtJ *r 


3124 


99 


17 


AF201303 


Homo sapiens 


dh£r oribeta- binding protein 
RXP60 


3130 


98 


18 


AF064205 


Homo sapiens 


dynactin 1 piso isoform 


6377 


100 


19 


U00059 


B cerevislae 


1 j> J. ^ A wp 


174 


26 


20 


AB032903 


Homo sapiens


reductase isolog


3 

idoi 


99 


21 


AB032903 


Homo sapiens 


guanosine monophosphate
reductase isolog 


14 85 


99 


22 


AF140507 


Homo sapiens 


Ca2+/calmodulin-dependent
protein kinase kinase beta 


3083 


99 


23 


AF140S07 


Homo sapiens 


Ca2+/calmodulin-dependent
protein kinase kinase beta 


2300 


55 ■' 


24 


AJ289131


Homo sapiens 


chondroitin 4-O-
sulfotransferase


2211 


99 




U334e0 


Homo 
sapiens 


DNA-directed RNA polymerase
I, largest subunit 


8777 


98 


26 


Y44488 


Homo sapiens 


ACRP30R2 variant protein. 


X387 


100 


27 


U43701


Homo sapiens


ribosomal protein L23a 


791 


100 


28 


U02032 


Homo sapiens 


ribosomal protein L23a 


767 


97 


29


¥41324 


Homo sapiens 


Human secreted protein 
encoded by gene 17 clone 
HNFIY77 . 


1083 


99 


30 


W71743 


Homo sapiens 


Human ubiquitin conjugation
system protein 2. 


715 


90 


"31 


W71749 


Homo sapiens 


Human ubiquitin conjugation 
system protein 2. 


631 


82 


0^ 


/ir <cO xyj. / 


Homo sapiens 


long-chain 2-hydroxy acid
oxidase HAOX2 


1811 


100 


33 


229481 


Homo sapiens 


3-hydroxyanthranilic acid 
dioxygenase


1S07 


99 


34 


AB001451 


Homo sapiens 


Sck 


2869 


100 


35 




Homo sapiens 


precursor polypeptide (AA -34 
to 287) 


1657 


99 


36 


Y00644 


Homo sapiens 


precursor polypeptide (AA -34
to 287) 


1104 


98 


37 


¥78795 


Homo sapiens 


Human antizuai-2 (AZ-2) amino
acid sequence.


3586 


78 


38 


¥78795 


Homo sapiens 


Human antizuai-2 (AZ-2) amino
acid sequence.


4726 


99 



39 


Y78795 


Homo sapiens 


Human antizuai-2 (AZ-2) amino
acid sequence.


3556 


77 


40


U93121 


Homo sapiens 


M-phase phosphoprotein-1


3747 


100 


41 


y427S0 


Homo sapiens 


Human calcium binding protein
1 (CaBP-1) . 


795 


100 


42 


AF262626 


Homo sapiens 


latexin 


1189 


100 


43 


002150 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6231.


384 


94 


44 


U19617 


Mus musculus


Elf-1


2724 


8B V 


45


U19617


Mus musculus


Elf-1 


20^2 


86 — — — 


46 


AF100758 


Homo sapiens 


osteoinductive factor OIF 


1538 


100 


47 


Y8 7591 


Homo sapiens 


Human SPROCTTV-l ptrbcein, SEQ 
ID NO:24. 


1737 


.... — 


49 


X04145 


Homo sapiens


T3 gamma precursor (aa -22 to
160) 


942 


Q Q 


51 


X63547 


Homo sapiens 


oncogene


584S 


99 


52 


M94043 


Rattus 
norvegicus 


rab-related GTP-binding
protein 


1089 


96 


53 


L317B3 


Mus musculus


uridine kinase 


917 


71 


54 


XS3973 




transcription factor 


4486 


98 


55 


AF224741 


Homo sapiens 


chloride channel protein 7 


4128 


99 


56 


W74805 


Homo sapiens 


Human secreted protein
encoded by gene 77 clone 


1491 


100 


57 


ZS0907 


Homo sapiens


Human TBC-1 cDNA from second
transcript.


4824


100 


58 


D79994 


Homo sapiens


similar to ankyrin of
Chromatium vinosum.


6089 


99 


59 


D79994 




similar to ankyrin of


4014 


91 


60


Y59738 


Homo sapiens 


Human normal ovarian tissue 

cJer i vef3 r»T*r> h ^ ■? n 1 


601 


100 


61 


AB031069 


Homo sapiens 


protein containing CXXC 
domain 1


1390 


100 


62


Y66660


Homo
sapiens


Membrane-bound protein
PRO783.


2492 


99 


63 


Y66660


Homo
sapiens


Membrane-bound protein
PRO783.


1709 


99 


64 


370011 


Rattus sp.


tricarboxylate carrier


895 


55 


65 


AF139518 


Rattus
norvegicus


A-kinase anchor protein


178 


24 


66 


M29666 


Homo sapiens 


Homo sapiens DH1308__1 clone 
secreted protein , 


157 


30 


67 


AJ245738


Homo sapiens 


claudin-is 


1206 


100 


68 


AF099138 


Rattus 
norvegicus 


GLUT4 vesicle protein


4183 


87 


69 


AF099138 


Rattus 
norvegicus 


GLUT4 vesicle protein 


4906 


86 


70 


282059 


Caenorhabdit 
is elegans 


Similarity to Drosophila ring
canal protein comes from 
this gene 


1285 


44 


71 


AF224278 


Homo sapiens 


PMEPA1 protein


1282 


100 


72 


AF126426 


Homo sapiens 


neurotrimin 


1809 


100 


"73 


Y41652 


Homo 
sapiens 


Human MSK2 protein sequence. 


2065 


99 


74 


Y41652 


Homo 
sapiens 


Human MSK2 protein sequence.


1207 


100 


75 


AF18a622 


Mus musculus


selectively expressed in 
embryonic epithelia protein- 1 


1485 


74 


76 


AE000406 


Escherichia
coli


putative DNA topoisomerase


9S0 


100 


77 


X99302 


Homo sapiens 


Popl 


655 


100 


78


AL136538 


Schizosaccha 

romyces 

pombe 


similarity to S. cerevisiae 
kti12 protein


210 


31 


79


AP129756 


Homo sapiens 


G4 


1554 


99


80 


AL096768


Homo sapiens


dJ858B16.2
(phosphatidylserine 
decarboxylase (PSSC, EC 
4.1.1.65) ) 


2033 


100




AL096768


Homo sapiens


dJ858B16.2
(phosphatidylserine 
decarboxylase (PSSC, EC 
4.1.1.65) ) 


1220 


96 


82 


XS73S1 


Homo sapiens 


1-8D 


677 


98 


83 


AC005594 


Homo sapiens 


R26984 1 


2700 


98 V 


84 


X731X3 


Homo sapiens 


fast MyBP-C 


S9S9 


S9 


85 


AF09733Q 


Homo sapiens 


H1 chloride channel; p64H1;
CLIC4


130S 


99 


86


AB01B423 


Mus musculus 


SH2 domain-containing protein


1360 


78 


87 


AF272151 


Homo sapiens 


adaptor protein CIKS 


3084 


99 


88


AF196329 


Homo 
sapiens 


triggering receptor expressed 
on monocytes 1 


1214 


100 


89 


AB016879


Arabidopsis
thaliana


contains similarity to pre-
mRNA splicing
factor-gene_id:MRB17.2


634 


36 


90


Aai33 721 


Mus musculus 


homeodomain protein 


6S4 


S7 


91 


AJ242864 


Mus musculus


phtf protein 


619 


61 


92 


A61971 


unidentified 


MCSP 


11676 


99 


93 


Y93365 


Homo sapiens


Human PRO1250 (UNQ633) amino
acid sequence SBQ ID N0:8S. 


3890 


100 


94 


Y87231 


Homo sapiens 


Human signal peptide 
containing protein HSPP-8 
SEQ ID NO:8.


1031


100


95 


AF227741 


Rattus 
norvegicus 


protein kinase WNK1


2428 


95 


96 


AF227741 


Rattus 
norvegicus 


protein kinase WNK1


1961 


94 


97 


Y92513 


Homo sapiens 


Human OXRE-10.


1626 


100 


98 


AL021366 


Homo sapiens 


CICK0721Q.3 (Kinesin related 
protein) 


3423 


100 


99 


AC00S733 


Homo sapiens 


R33083_l 


1974 




100 


Y95293 


Homo sapiens 


Human GEF containing NEK-like
kinase substrate SGNK.


4092 


99 


101 


ALllBSOl 


Homo sapiens 


dJll9lN16.1 (A novel protein 
(translation of the cDNA 
DKFZp566A0946 , Em:Ai:i0S0069) ) 


1S09 


100 


102 


Aa006267 


Homo sapiens 


ClpX-like protein


3233 


100 


103 


AF1007S3 


Homo sapiens 


ancient ubiquitous 46 kDa
protein AUP1


2042 


96 


104 


AB015982 


Homo sapiens 


serine /threonine kinase 


4718 


100 


105


AF151074 


Homo sapiens 


KSPC240 


831 


64 


106 


M35S22 


Canis
familiaris


GTP-binding protein (rab7) 


354 


50 


107 


Ii99800 


Homo sapiens 


WTII-l nerve protein, 
facilitates regeneration of 
nerve cells. 


2337 


93 


108 


AF125533 


Homo sapiens 


NADH-cytochrome b5 reductase
isoform 


12^0 


93 


109 


AC005614 


Homo sapiens 


J O ? ^ 


3369 


99 


110 


AF064729 


Homo sapiens 


RAN binding protein 16 


3285 


100 


111 


XS2425 


Homo sapiens 


Interleukin 4 receptor


4496 


100 


112 


Y41686 


Homo 
sapiens 


Human PR0274 protein 
sequence . 


2285 


100 


113 


W15506 


Homo sapiens 


Mitogen activating protein 
kinase ERKi- 


1991 


100 


114 


y7l071 


Homo sapiens 


Human membrane transport 
protein, MTRp-16. 


1190 


99 


115 


AL04S548 


Homo sapiens 


dJ398G3.1 (ortholog of rat 
CPG2) 


3497 


99 


116 


AF189817 


Mus musculus


evectin-2 


1124 


90 


117 


W30a91 


Homo
sapiens


Human cytostatin III protein.


71S


93








118


AF116G18 


Homo sapiens 


PRO1038


1469 


100 


119 


^08915 


Homo sapiens 


alpha 4 protein 


1748 


100


120


AF098070


Drosophila
melanogaster


Lis1 homolog


192 


39 


121 


AF052432


Homo sapiens


katanin p80 subunit


181 


37 


122 


\r70743 


Homo sapiens 


PSEQ-1 protein encoded by 
NSEQ gene associated with 
matrix remodelling. 


2637 


98 


123 


AF083246 


Homo sapiens 


HSPC028


2132


100


124 


727096 


Homo sapiens 


Human viral receptor protein 
(ACVRP) . 


833 


99 


125 


M63109 


Leishmania
major 


glycoprotein 96-92 


172 


27 


126 


1775467 


Drosophila 
melanogaster 


Atu 


935 


36 


127 


Z68220 


Caenorhabdi t 
is elegans 


similarity to Human ADP/ATP 
carrier protein 


438 


43 


128 


AF095927 


Rattus 
norvegicus 


protein phosphatase 2C 


1927 


94 


129 


W 929b 8 


Homo sapiens 


Human zsig44 protein.


463 


100 


130 


AF115391 


Lactobacillu 
s sakei 


ribokinase RbsK


508 


37 


131 


X93498 


Homo sapiens 


21-Glutamic Acid-Rich Protein


X2S0 


100 


132 


X93498 


Homo sapiens 


21-Glutamic Acid-Rich Protein




87 


133 


W528ia 


Homo sapiens


Human DBI/ACBP-like protein
(DBIH) . 


705 


97 


134 


ya4444 


Homo sapiens 


Amino acid sequence of a
human RNA-associated
protein.


3230 


100 


135 


"M69181 


Homo sapiens 


non-muscle myosin B 


189 


20 


136 


W74&82 


Homo sapiens 


Human secreted protein 
encoded by gene 154 clone 
HE6Pli83. 


480 


100 


137 


W78200 


Homo sapiens 


Human secreted protein 
encoded by gene 75 clone 
HHGAU81 . 


855 


99 


138 


AL033S20 


Homo sapiens 


dJ349A12.1 (similar to
KIAA0701 protein) 


424 


39 


139 


AF020261 


Santalum
album 


proline rich protein 


119 


30 


140 


X70394 


Homo sapiens 


zinc finger protein 


X634 


100 


141


Y06439 


Homo sapiens 


Human protease HUPM-B. 


936 


100 


142 


Z68493 


Caenorhabdi t 
is elegans 


predicted using Genefinder


365 


42 


143 


AB6i8107 


Arabidopsis
thaliana 


ADP-ribosylation factor- like 
protein 


596 


65 


144 


AF1^1483 


Homo sapiens 


HSPC134 


580 


51 


145 


Y84902 


Homo sapiens


A human proliferation and
apoptosis related protein. 


480 


100 


146 


AB004906 


Ipomoea
purpurea 


transposase 


146 


20 


147 




Arabidopsis 
thaliana 


F3F19 .18 


647 


31 


148 


W751S5 


Homo sapiens 


Human secreted protein 
encoded by gene 41 clone 
HNTME13 . 


1494 


98 


149 


AF056490 


Homo sapiens 


cAMP- specific 
phosphodiesterase 8A


3710


99 


150


YS8171 


Homo 
sapiens 


Human hydrolase homologue 
HHH-7. 


785 


99 


151 


010397 


Saccharomyce
s cerevisiae 


Yhrl48wp 


515 


S3" 


152 


X73478 


Homo sapiens 


phosphotyrosyl phosphatase
activator 


1719 


99 


153 


AL049697 


Homo sapiens 


dJ382I10.5.1 (novel protein
similar to arginyl-tRNA)


2034


99






154 


AF169802 


Homo sapiens 


cytochrome b5 reductase b5R.2


1455 


99 




X94703 




rab28 


1126 


99 


156 


Y25716 




Human secreted protein
encoded from gene 6.


1471 


100 


158


W77404 




Secreted salivary polypeptide
zsig32. 


937 


100 r 


159 


VI Afi 


Homo sapiens


Human protein kinase
inhibitor-2 (PKI-2).


383 


100 ■ 


160 








2395 


100 ' 


161 


W54040 


Homo sapiens 


Human interferon-inducible 
protein, xij.rx* 


484 


98 


162 


AIi022724 


Homo sapiens 


dJ413H6.1.1 (hamster
Androgen-dependent Expressed
protein) (isoform 1)


1357 


100 


163 


AF125535 


Homo sapiens


pp21 homolog


193 


45 


164 


G03 632 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 7713.


463 


97 


165 


AJ2S0S39 


Homo sapiens 


serine /threonine protein 
kinase 


1442 


71 


166 


Ij09649 


Zymomonas 
mobilis 


zm2 


173 


37 

_ 


167 


Y73337 


Homo sapiens 


HTRM clone 194 4530 protein 


1204 


100 


168 


W8864S 


Homo sapiens 


secreted protein encoded by 
gene lu cJ.one xiuivrw /x . 


1084 


100 


169 


AF214731 


Homo sapiens 


ATP-dependent RNA helicase 


4402 


100 


170 


AE000871 


Methanobacte
rium
thermoautotr
ophicum


conserved protein 






171 


fbtSH, 


Homo sapiens 


Human secreted protein


821 


100 


172 


a P"?*? end A 

H«iL ^ O V# % ^ 




HSNFRK 


2904 


100 


173 


t\kJ 3 J^t O 






779 


100 


174 


D43949 




This gene is novel.


3202 


100 


175 


Y07923 


Homo sapiens 


GTP-binding protein 


1205 


100 


1-7^ 
JL / b 


Vi3 \J ^ J C} 


sapiens 


Human DP1 homologous protein.


966 


100 


177 


Y4167S 


Homo sapiens 


Human channel-related 
molecule HCRM-3 . 


1122 


100 


178


y41674 


Homo sapiens 


Human channel-related 
molecule HCRM-2 , 


936 


99 


179 


AF220492 


Homo sapiens 


krueppel-like zinc finger 
protein HZF2 


4100 


99 


180 


X03084 


Homo sapiens 


C1q B-chain precursor


1240 


100 


181 


U57344


Mus musculus


Meis3 


1613 


69 


183 


U57344


Mus musculus 


Meis3 


1743 


86 


184 


U57344


Mus musculus


Meis3 


1070 


86 


185 


AF033120 


Homo sapiens 


p53 regulated PA26-T2 nuclear
protein 


1389 


58 


186 


AF200357 


Mus musculus 


pantothenate kinase 1 beta 


1605 


82 


187 


W75058 


Homo sapiens 


Human secreted protein 
encoded by gene 2 clone 
HLDBG33 . 


1188 


99 


188


AJ292S29 


Homo sapiens 


suppressor of sterile four 1 


2424 


100 


190


X54T34 


Homo sapiens 


protein-tyrosine phosphatase 


3705 


100 


191 


y22203 


Homo sapiens 


Human calcium-binding 
phosphoprotein, CBPP-1,
protein sequence. 


1063 


99 


192 


W63692 


Homo 
sapiens 


Human secreted protein 12. 


1975 


100 


193 


We7772 


Homo sapiens 


Human serum glucocorticoid-
regulated kinase (H-SGK2)
polypeptide.


2605 


99 


194


AF0842S9 


Mus musculus


bromodomain-containing
protein BP75 


693 


54 


195 


Y007S2 


Rattus 
norvegicus 


serine dehydratase (AA 1 - 
327) 


994 


61 


196 


VJ9S349 


Homo sapiens 


Human foetal brain secreted 
protein f hl70 7 . 


2596 


100 


197 


AB0288S9 


Homo sapiens 


hDj9 


1890 


100 


198 


W9S633 


Homo sapiens


Homo sapiens secreted protein 
gene clone hm236__l. 


1614 


100 


199 


Y44277 


Homo 
sapiens 


Human nucleic acid methylase- 
2. 


2096 


99 


200 


AB030039 


Homo sapiens 


hPACPLl 


22S8 


100 


201 


X54162 


Homo sapiens 


S4 Kd autoantigen 


2918 


99 


202 


G02061 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6142. 


SS8 


99 


203 


X138BS 


Nicotiana 
tabacum


extensin (AA 1-620) 


185 


33 


204 


J04204 


Bos taurus 


32 kd accessory protein 


1837 


100 


205 


J04204


Bos taurus 


32 kd accessory protein 


1101 


100 


207


Y87283 


Homo sapiens 


Human signal peptide
containing protein HSPP-60 
SEQ ID NO: 60. 


1318 


100 


208 


yO2860 


Homo sapiens 


Fragment of human secreted 
protein encoded by gene 6S. 


936 


98 


209 


AL121889


Homo sapiens


dJ1076E17.1 (KIAA0823 protein
(continues in AL023803))


694 




210 


AF226732 


Homo sapiens 


NPD007 


1345 


76 


211 


X66295 


Mus musculus 


C1q C chain


970 


73 


212 


Z29328 


Homo sapiens 


Ubiquitin-conjugating enzyme
UbcH2 


966 


100 


213 


Z29328 


Homo sapiens 


Ubiquitin-conjugating enzyme
UbcH2


542 


98 


214 


AJ002030 


Homo sapiens 


progesterone binding protein


1163 


100 


215 


X70649 


Homo sapiens 


member of dead box protein
family 


3933 


100 


216


AF2S055e 


Homo sapiens 


claudin-2 


1169 


99 


217 


AL021453


Homo sapiens


dJ821D11.1 (PUTATIVE protein)


259 


100 


218 


Yd 8565 


Homo sapiens 


UDP-GalNAc:polypeptide N-
acetylgalactosaminyltransferase


3331 


99 


219 


Y944S2 


Homo sapiens 


Human inflammation associated 
protein 


2067 


100


220 


AI*035S21 


Arabidopsis 
thaliana 


putative protein


315 


42 


221 


AliO3l706 


Schizosaccha 

romyces 

pombe 


putative proline-tRNA
synthetase 


eix 


41 


222 


J\iaXvlf / Jo 


Schizosaccha 

romyces 

pombe 


WD repeat protein


626 


40 


223 


X5 2493 


Glycine max 


DNA-directed RNA polymerase


136 


23 


224 


.n±JKf -J O w 


Homo sapiens 


CU9/9N1*! (aiJ979Nl , 1) 


5199 


98 


225 


AB032401 


Mus musculus 


mnxDj 4 


X /t> J. 




226 


AB032401


Mus musculus 


mmD 3 4 


1986 


92 


227 


X83502 


Saccharomyce 
s cerevisiae 


Ji007 


112 


26 


228 


X83502


Saccharomyce 
s cerevisiae 


ai007 


79 


25 


229 


AF143723 


Homo sapiens 


heat shock protein HSP60 


2557 


99 


230 


Y6€€77 


Homo 
sapiens 


Membrane-bound protein
PRO828.


982 


100 


231 


AB027466 


Homo sapiens 


spondin 2 


1756 


99 


232 


W95634 


Homo 
sapiens 


Homo sapiens secreted 
protein. 


1391 


100 


233


K00365 


Homo sapiens 


Human cyclin Bl . 


2218 


33 ... ^ 


234 


YS3762 


Homo sapiens 


A GTP-binding polypeptide


1017 


100


SEQ 
ID 
NO:



235 



236 



235 



24X 



243 
24 4 



ACCESSION 
NUMBER 



Z50749 



Z50749 



AB026491 
AJ270205 



SPECIES 



Homo sapiens 



Homo sapiens 



AB030189 



WS653B 



W56S38 



AF1S5107 
AL031320 



Homo sapiens 



Entodinium
caudatum 



Mus musculus 



Homo sapiens



Mpxens 



Homo sapiens 



Homo sapiens 



Homo sapiens 



DESCRIPTION 



designated RAQ. 



yeast sds22 homolog



yeast sds22 homolog
PICK1



putative 

phosphatidylinositol-4-
phosphate 5-kinase



contains transmembrane (TM) 
region and ATP binding region 



Human hedgehog interacting
protein (HIP).



Human hedgehog interacting
protein (HIP).



NY-REN-37 antigen
NY-REN-37 antigen



SMITH- 
WATERMAN
SCORE 



1800 



17S4 



2137 
114 



710 



3785 



3436 



996 



ciiT2bN2*l (novel protein 
similar to yeast and 
bacterial cytokine 
deaminase) 

sodium channel beta 2 subunit



) 10 QS 



IDENTITY 



100 
^8 



99 



99 



100 



99 



246 



AL078599 



Rattus 
norvegicus 



162 



Homo 



sapiens 



da991C6.l (novel protein 
similar to C. elegans 
FSSA12.9 (Tr:P91086)) 



239: 



30 



98 



248 



249 



250 



Saccharomyce
s cerevisiae 



Ydr386wp; CAIt 0,12 



y417l9 



AB029434 



Homo 
sapiens 
Homo sapiens 



Human PR0864 protein 
sequence . 



191 
1079 



Rattus 
norvegicus 



ghrelin



precursor 



carnitine/acylcarnitine 
carrier protein 



611 
246 



37 



100 



252 



255 



2S6 



2S7 



258 



~559- 



Y94873 



Homo 
sapiens 



Human RIP- interacting factor 
RIF. 



W59878 



mo 
sapiens 



Human protein clone HP02632, 



Homo sapiens 



Ab3S4S33 



AF233322 



Leishmania
major



Amino acid sequence of the
cDNA clone AIP-2 (HEB GM49).
possible adenylate kinase 



765 



265 



Mas musculus 



Y7ail3 



Homo sapiens 



zinc transporter like 2 



AL035539 



Arabidopsis
thaliana 



Human cytokine signal 
regulator CKSR-l SEQ ID 

NO:l. 



1916 



W74787 



Homo sapiens 



putative amino acid transport 
protein 



390 



■Ai;b3^689 



Homo 



Human secreted protein 
encoded by gene 68 clone
HHFHN61, 



1171 



sapiens 



ciai8 7Jll,l {novel protein 
similar to protein kinase C 
inhibitors) 



34 



95 



99 



27 



100 



264 



AL0S0131 



Methanobacte
rium
thermoautotr
ophicum



serine/ threonine protein 
kinase related protein 



363 



AF019661 



Homo sapiens
Mus musculus 



hypothetical protein 



AIi035S93 



AL022318 



Homo sapiens 



zeta proteasome chain; psmas 



Homo 



sapiens 



AF2 05940 
AL023S83 



Homo sapiens 



jj3 lOJ6.i (novel protein) 

"5KiS0C2.3 (PUTATIVE novel 

protein similar to APOBECl) 



1214 



821 



endomucin 



1289 



30 



100 



100 



Homo sapiens 
Homo sapiens 



dJ500Xil4.1 {novel proteinF 



~" — ~ • "» * — - f v^^^ii/ 

dJ1103G7 .3 [novel protein 
kinase domains containing 
protein similar- to 
phosphoprot e in C8FW) 



789 



1888 



99 



SEQ 
ID 
NO: 
268 


ACCESSION 
NUMBER 

AF161470 


SPECIES 
Homo sapiens 


DESCRIPTION
HSPC121


SMITH- 
WATERMAN
SCORE 
1884 


%

IDENTITY

98 — 


269 
270 

271 " 


AF161470 
X90763 

• AF207600 


Homo sapiens 
Homo 
sapiens 
Homo sapiens 


HSPC121 

HHaS hair keratin type I 
intermediate filament 
ethanolamine kinase


1232 
2190 


96 

99 ^ 


272 
273 


M32334 
AF1614 83 


Homo sapiens 
Homo sapiens 


intercellular adhesion

molecule 2 

HSPC134 


19S2 
1436 


100 . 

100


274 


Y53052 


Homo sapiens 


Human secreted protein clone 
df202_^3 protein sequence SEQ 
ID NO: 110. 


663 
587 


61 ^ 

100" 


276 


Y77S76 


Homo sapiens 


Human cytoskeletal protein 
(HCYT) (clone 219S418) . 


762 


100 


277 


AP077042 


Homo sapiens 


30S ribosomal protein S7
homolog


1269 


100 


278 


Y94907 


Homo sapiens 


Human secreted protein clone " 
cal06^19x protein sequence 
SEQ ID NO; 20, 


1619 


98 


279 


Y68788 


Homo sapiens 


Amino acid sequence of a
human phosphorylation
effector PHSP-20. 


2801 


*99 


280 


Z75134 


Canis ™* 
familiaris 


rod transducin


1816 


100 


281 

282 
283 


Z7S134 

AF249873 
AL050007 


Canis 

familiaris 
Homo sapiens
Homo sapiens 


rod transducin " 

muscle-specific protein 
Hypothetical protein 


1718 

1395 
405 


100 


284 
285 
286 

287 
288 


AF201931 
AF156102 
Y3S897' — 

U88964 
AL050143 


Homo sapiens
Homo sapiens
Homo sapiens 

Homo sapiens 
Homo sapiens 


DCl 

ELL complex EAP30 subunit
Extended human secreted
protein sequence, SEQ ID NO.
146.

HEM45

hypothetical protein


1859 
1318 
^li^SO 

923 


99 
99 
99 

100 


289 
290 

291 


AuTOllOSS 
Y66724 

AF0i4801 


Homo sapiens 
Homo 
sapiens 
Homo sapiens 


telethonin 

Membrane -bound protein ~ " 

PR0836. 

liprin-alpha4


598 
574 

2565 


100 
100 
100 

98 


292 
293 


AF034801 
Ab049851 " 


Homo sapiens 
Homo sapiens 


liprin-alpha4 

ciJ889»J22B.l ^ novel protein — 

(isoform 1) ) 


2590 
1738 


100 

100


294 
295 


y73348 
Lil672 


Homo sapiens 
Homo sapiens 


Witw cj.one tf^if&bi protein 
sequence . 

zinc finger protein 


1245 


99 


296 


AI*035423 


Homo sapiens 


aJ20l3.1 (brain mitochondrial" 
carrier protein- 1 (BMCPi) > 


1694 
1024 


44 
79 


297 
298 


API 98532 
AF161417 


Homo sapiens 
Homo sapiens 


lymphoid enhancer binding 
factor- 1 

HSPC299 " 


2173 


100 


299 


AF1S9141 


Homo sapiens 


breast cancer metastasis- 
suppressor 1 


1147 
"123^ 


85 
99 


300 


U26397 


Rattus
norvegicus 


inositol polyphosphate 4- 
phosphatase 


160 


30 


301 
302


AF036145 
Z82022 


Homo sapiens 
Homo sapiens 


meningioma -expressed antigen 
S 

GlcNAc-1-P transferase


3458 


100 


303 


AF269232 


Mus musculus 


butyrophilin-like protein
BUTR-1


206:^ 
271 


99 
50 


304 


AJ2 22644 


Arabidopsis
thaliana


asparaginyl-tRNA synthetase


659 


SO 


305 


^^■054180 1 


Homo 
sapiens 


hematopoietic cell derived 
zinc finger protein' 


'351 


79 ■ 


306 " J 


iva;i72 079 ] 


Homo sapiens


APOBEC-1 stimulating protein


3056 


100 


308 


^44486 1 


Homo
sapiens


Human GPRW receptor

polypeptide. 


1721 


100 

* 


309 2 


Auxoio^^x 1 Komo sapiens | DNA polymerase mu 


2598 


100 1 



SEQ 
ID 
NO: 
310 


ACCESSION 
NUMBER 

AF29333S 


SPECIES 
Homo sapiens


DESCRIPTION 

p30 DSC 


SMITH-
WATERMAN
SCORE
1248 


IDENTITY 
92 


311 
312 

313 
314 


AF176525 
X570O2 

Z3671S 
AF161532 


Mus musculus 
Homo sapiens

Homo sapiens 
Homo sapiens 


F-box protein FBL12
immunoglobulin lambda light
chain 
Net 

HSPC047 


150 1 
959 - 

2048 


81 
98 


31S 
316 


AF208068 
^66666 


Homo sapiens 

Homo 

sapiens 


kelch-like protein KLHL3a 
Membrane -bound protein 
PRO1013. 


727 
3046 


100 
100 
100 


317 
318 


Y29666 
AJ387747 


Homo sapiens 
Homo sapiens


Human Has protein hapr-i. 
sialin — ■ ■ ■ 


1253 


98 


319 


Af 161362 


Homo sapiens 


HSPC099 


2614 
" 224 


99 
40 


320 


y68773 


Homo sapiens 


Amino acid sequence of a
human phosphorylation


2243 


99 


321 
322 


AJ238379 
Afe640ei2 


Homo sapiens 
Homo sapiens 


putative THl protein 
protein kinase PAKS 


3013 


100 


323 


y9S013 


Homo sapiens 


Human secreted protein 
VC48_1. SEQ ID *NO:66, 


3792 
913 


99 
100 


324 


yi33ai 


Homo sapiens 


Amino acid sequence of
protein PR0271. 


19'>6 


100 


325 


¥94944 


Homo sapiens 


Human secreted protein clone 
^^1^"^.,!^ protein sequence 
SEQ ID NO:94.


2305 


98 


326


Y768B4 


Homo sapiens 


Retinoblastoma binding 
protein-7 sequence.


6728 


99 


327 


AF198S32 


Homo sapiens 


lymphoid enhancer binding
factor-1


2173


100


329 


Z78013 
AF212921 


Caenorhabdit
is elegans 

Mus musculus 


Similarity to Drosophila 
Cadherin- related tumor 
suppressor 

MMTV receptor variant i 


569 


33 


330 


Z75330 


Homo 
sapiens] 

>K.O 3<U / 

MAR-1995 27- 

AUG-1993 

Human 

stfomalin-i . 
[Homo 
sapiens 


nuclear protein SA-l 


484 
6492 


94 

99 


331 


AL00B583 


Homo sapiens


dJ327J16.3 (supported by
GENSCAN, FGENES and GENEWISE)


2133 


99 


332 


¥36104 


Homo sapiens 


Extended human secreted 
protein sequence, SEQ ID NO.
489. 


310 


41 


333 


AJ271669 ■ • 


Homo sapiens 


putative sialoglycoprotease


1747 


100 


334 


AF156599 


Mus musculus


p53-regulated DDA3


997 


64 


335 


M99058 


Eimeria 
maxima 


emlOO gene is homologous the ; 
Eimeria tenella gene etlOO ; 


154 


26 


336 


Y85564


Homo sapiens


Human homologue of UNC-53
(Hs-UNC-53/1) sequence.


3386 


97 


337 


Y85564


Homo sapiens


Human homologue of UNC-53
(Hs-UNC-53/1) sequence.


2602 


94 


338 


Y85564 


Homo sapiens 


Human homologue of UNC-53
(Hs-UNC-53/1) sequence.


3447 


98 


339 


Z66S61 


Caenorhabdit 
is elegans 


Similarity to Human rabl3 
protein (PIR Acc. No. 
A49647) , 


716 


34 


340 


AB021643 


Homo 
sapiens 


gonadotropin inducible 
transcription repressor-3


2761 


99 


341 


S01946 ] 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6027. 


465 


98 


342 J 

343 - 


^F020S91 ] 
[j29154 i 


Homo sapiens
Homo sapiens


zinc finger protein
immunoglobulin heavy chain


1091 
439 - 


48 

84 - 



SEQ
ID
NO: 


ACCESSION 
NUMBER 


SPECIES 


VDJ region 


SMITH- 


% 

IDENTITY 


344 
345 


U10281 
AK000404 


Sus scrofa 
Homo sapiens 


gastric mucin 

unnamed protein product 


279 
1177 


24 
99 


346


L22557


Rattus
norvegicus


calmodulin-binding protein


1949 


84 


347 


L22557


Rattus
norvegicus


calmodulin-binding protein


2363 


yx 


348 


AIiQ4948l 


Arabidopsis 
thaliana 


AIGl-like protein 


"316 


30 * 


350 


AJ251S16 


Mus musculus


cysteine and histidine-rich 
protein 


1460 ' 


99 


351 


AK024477 ~ 


Homo sapiens 


FLJ00070 protein


1773


100


352 


U50133 


Homo sapiens 


ankyrin 


502 


33 


353 


AK00062S 


Homo sapiens 


unnamed protein product 


721 


100 


354 


AFl 6X420 




JliOf ^ J V 


2623 


97 


355


Alio 10014 




n^bA proudn 


1269 


47 


356


AF15102S 






941 


91 


357


AL022327


Homo sapiens


ouJSbClO.l tKIAA0027) 


1911 


100 


358 


V/78128 


Homo sapiens 


Human secreted protein 
encoded by gene 3 clone 

Uf^Ott "r n 

^058196 . 


1117 


100 


359


X03414 


i^x.u±iupnjL ji,a 
melanogaster 


Kr polypeptide 


316 


45 


360 


AFiSi075? 


Homo sapiens


jior'Cjc4a 


643 


100 






Homo sapiens 


A suppressor of cytokine
signalling protein
designated HSCOP-5.


530 


41 


362


AF254741


melanogaster


Centaurin Gamma 1A


681 


46 


363 


AF213465 




dual oxidase


2016 


100 


364 


AF181562


Homo sapiens 


proSAAS 


1319 


100 


365 


AF181562






1024 


99 


366 


U73iOO 


Mus musculus 


pllGRip 


884 


62 


367 


AF263744 


Homo sapiens


erbb2-interacting protein
ERBIN 


4973 


99 


368


U37501


Mus musculus


laminin alpha 5 chain


5867 


72 


369 


AF04369S 


Caenorhabdit
is elegans


similar to the protein 


549 


36 


370 


Y73440 


Homo sapiens


numcxij 9ec]7ev«eci pirocexn cxone 
V'i23 1 Dirot^in *a^nTi*=trkf*fi crt?A 
ID NO: 102. 


1484 


99 


371 


AF272833 


Homo sapiens 


misato 


2869 


97 


372 


AF198454 


Homo sapiens 


epithelial protein, lost in
neoplasm beta 


3 92*7 


100 


373 


y73345 


Homo sapiens 


HTRM clone 436283 protein 
sequence . 


273 


0 J 


374 


AF169017 


Homo sapiens 


formiminotransferase
cyclodeaminase


2717 '■ ■ 




375 


A95106 


unidentified


RED ALPHA 


1202 




376 


W74828 


Homo sapiens


Human secreted protein 
encoded by gene 100 clone 
HLQAB52 . 


1012 


99 ~" 


377 


Y32131 


Homo sapiens 


Human LYST-2 protein. 


3556 


99 


378 


M14912 


Homo sapiens 


pol 


132 


86 


379


AF090934


Homo sapiens 


PRO0318 


3 82 


100 


380 


X66362 


Homo sapiens 


serine/ threonine protein 
kinase 


2499 


100 


381 


^41699 


Homo 
sapiens 


Human PR0703 protein 
sequence . 


2362 


100 


382 


AF174498


Homo sapiens 


GR AF-1 specific protein 
phosphatase 


7008 


98 


383 


U64608 


Caenorhabdit
is elegans 


coded for by C. elegans cDNA 
ykl73cl2.5 


246 


36 


384 


US0133 


Homo sapiens 


ankyrin 


502 


33 


385 


AJ238520


Homo sapiens 


putative transcription 
factor-like nuclear regulator


4123 


97
TABLE 2



PCT/US00/34263



SEQ 
ID 
NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN
SCORE 


% 

IDENTITY 


387 


AF20884S 


Homo sapiens 


BM-003 


1375 


99 


389 


XS7B21 


Homo sapiens 


immunoglobulin lambda light
chain 


797 


76 


390 


AF182404 


Homo sapiens 


mitochondrial uncoupling
protein 1 


1670 


99 


391 


V8SS64 


Homo sapiens 


Human homologue of UNC-53
(Hs-UNC-53/1) sequence.


3386 


37 


393 


AF178432 


Homo sapiens 


SH3 protein 


3700 


100 


394 


AF229928 


Drosophila 
melanogaster 


cytoplasmic protein 89BC 


1616


62


395


AF181721 


Homo sapiens 


R02S 


2254 


100 


396 


Y69197 


Homo sapiens 


Amino acid sequence of a
human betaIV-spectrin
protein. 


1626 


98 


397 


U48238


Mus musculus


zinc finger protein neuro-d4 


749 


60 


398 


AI4390137 


Homo sapiens 


hypothetical protein 


263 


51 


399 


AF217525 


Homo sapiens 


Down syndrome cell adhesion 
molecule 


5337 


60 


400 


AL022S99 


Schizosaccharomyces pombe


WD repeat protein 


447 


27 


401 


AGO 04 859 


Homo sapiens 


similar to 2-oxoglutarate
dehydrogenase; similar to
Q02218 (PID:g1352618)


4176 


78 


402 


AB010266 


Mus musculus


tenascin-X 


10246


62


403 


AL133288 


Homo sapiens 


dJ671D7.1 (similar to 
D. melanogaster CGS986 
protein) 


761 


100 


404 


Z68753 


Caenorhabdit 
is elegans 


ZC518.3b 


886 


48 


405


Z78013 


Caenorhabdit
is elegans 


Similarity to Drosophila 
Cadherin-related tumor
suppressor 


S69 


33 


406 


AB031230 


Homo sapiens 


protein containing CXXC 
domain 2 


1196 


97 



407 


AF155106 


Homo sapiens 


NY-REN-36 antigen 


1168 


100


408 


Y57945 


Homo sapiens 


Human transmembrane protein 
HTMPN-69. 


1S38 


99 


409 


Z18361 


Ovis aries


trichohyalin


184 


30 


410 


AF249744 


Homo sapiens 


RhoGEF


2733 


100 


411


AF176529 


Mus musculus 


F-box protein PBX13 


2072 


94 


412 


AF210e42 


Homo sapiens 


HAkP 


4880 


100 


413 


AL031658 


Homo sapiens 


dJ310O13.7 (novel protein
similar to H. roretzi HRPET-
3)


776 


98 


414 


X57398


Homo sapiens 


pM5 protein


6131 


99 


415 


AB029826 


Homo sapiens


3-methylcrotonyl-CoA
carboxylase biotin-containing 
subunit 


2961 


99 




U43503 


Saccharomyce
s cerevisiae


Lphlp 


115 


42 


417 


AItl60493 


Leishmania 
major


possible t26fl7-21 


239


35


418 


Y08100 


Homo sapiens 


Human PR0331 protein. 


330 


29 


419 


uisi3i 


Homo sapiens 


pl26 


2228 


54 


420 


AF117946 


Homo sapiens


Link guanine nucleotide 
exchange factor II 


2363 


100 


421 


AF190635


Drosophila 
melanogaster 


ankyrin 2


755 


30 


422 


AF302150 


Homo 
sapiens 


phosphoinositol 3-phosphate-
binding protein-2


1962 


100 


423 


AL137530


Homo sapiens 


hypothetical protein 


433 


94 


424 


X63753 


Homo sapiens 


Eon-a 


7269 


100 


425 


AB027249 


Homo sapiens 


MAPKK like protein kinase


1693 


100 


426 


AF279144 


Homo sapiens 


tumor endothelial marker 7
precursor


1084
SEQ
ID

NO:


ACCESSION
NUMBER


SPECIES


DESCRIPTION


SMITH-
WATERMAN
SCORE


% IDENTITY


427


AF279144 


Homo sapiens 


tumor endothelial marker 7 
precursor 


1259 


56 


428 
429 


AE003683 
Y07829


Drosophila 
melanogaster 
Homo sapiens


CG8312 gene product
RING finger protein


149 


29 


430 
431 


AP096897 
U41387


Drosophila 
melanogaster 
Homo sapiens 


pushover 
Gu protein 


2201 
4442 

4021 


99
47

99


432
433 


AF146760 


Homo sapiens

Homo 

sapiens 


nephrocystin 

septin 2-like cell division
control protein 


3783 
2284 


100

100 


434 


AB006697 


Arabidopsis 
thaliana 


cleft lip and palate 
associated transmembrane 
protein-like 


886 


42 


437 


Y94247


Homo sapiens 


Human calcium binding protein 
hCBP. 


1704 


100 


438 


AB040672 


Homo sapiens


UDP-GalNAc:polypeptide N-
acetylgalactosaminyltransfera
se


1075 


63 


439 


AF105228 


Bos taurus 


tuftelin 


285 


33 


440
441 


R06463 
X14971 


Homo sapiens 
Mus musculus


Derived protein of clone 
ICA13 (ATCC 40SS3) , 
alpha-adaptin (A) (AA 1-977; 


3073 


99 


442 


X53773 


Rattus
norvegicus 


alpha-c large chain (AA 1-
938) 


4897 
3979 


98 
81 


443 


Y66689


Homo 
sapiens 


Membrane-bound protein
PR01136. 


3299 


99 


444 
445 


AC067754 
AF229032 


Arabidopsis
thaliana
Mus musculus


unknown protein; 20348-23707 
plli 


114 


33 


446 
447 


AF05603S 
AF1324a4 


Rattus
norvegicus
Mus musculus 


s-nexilin 

unknown


2077 
"26^2 


93 
85 


448 


Wa9024 


Homo sapiens 


Polypeptide fragment encoded 
by gene 156.


4 78 
528 


51 
45 


449 


AF161445 


Homo sapiens


HSPC327 




100 


450


Z68753


Caenorhabdit
is elegans


ZC518.3b


9S1 


49 


451


W39160 


Homo sapiens 


Human partial complement 
factor H protein fragment 3 . 


155 


32


452


WaS727 


Homo 
sapiens 


Novel protein (Clone 
BM46_10) . 


■2->9d 


99 


453 


YS3629 


Homo sapiens 


A bone marrow secreted
protein designated BMSXlS.


2810 


100 


454 


D87438 


Homo 
sapiens 


Similar to a C.elegans 
protein in cosmid C14H10 


4069 


100 


455 


AF240468


Homo sapiens 


nicastrin 


3687 


100 


456 


Z15005 


Homo sapiens 


CENP-E 


13305 


99 


457 


M59216 


Homo 
sapiens 


gamma-aminobutyric acid 
receptor beta-1 subunit 


2477 


100 


458 


Y73467


Homo sapiens 


Human secreted protein clone 
yd6l_i protein sequence SEQ 


966


100 


459


W^7824 


Homo sapiens 


Human secreted protein 
encoded by gene IB clone 
HSLPM29 . 


S3S 


100 


460 


AF163151


Homo sapiens 


dentin sialophosphoprotein 
precursor


279 


19 


461 


D87446 


Homo sapiens 


similar to a C.elegans 
protein encoded in cosmid 
C27F2 (U40419) 


9196 


99 


462 


G04Q44 


Homo sapiens 


Human secreted protein, seq 
ID NO: 812S. 


486 


93 


463 


AC002398


Homo sapiens


F2S965 1 


1018 


100 


464 


AF064856


Rattus sp.


7acomp protein 


1845 




465 


FVF223408 1 


Homo sapiens


B99


3686 


39 






TABLE 2 



SEQ 
ID 
NO:


ACCESSION
NUMBER 


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


% 

IDENTITY 


466 


AF223406 


Homo sapiens 


B99 


2878 


87 


467 




Mus musculus


gene trap locus-13


6336 


91 


468




U53450 


Rattus 
norvegicus


Jun dimerization protein 1

JDP-1


196 


49 


469 




Homo sapiens 


aiJ9/F20-l I novel gene) 


3564 


99 




>if t\}/f 


Homo sapiens


eukaryotic translation
initiation factor EIF2B
subunit 3 


1274 


95 




L28125


Podospora
anserina


beta transducin-like protein


284 


38


472


1 o4yU J 


Homo sapiens 


A human proliferation and 
apoptosis related protein. 


2337 


100 


473




Homo sapiens 


LOMP protein 


252 


44 


474 


Y71213 


Homo sapiens 


Human irritable bowel disease 
related polypeptide IMX3 9, 


838 


100 


475 


Y95006 


Homo sapiens 


Human secreted protein 
vel3_l, SEQ ID NO:52. 


3411 


100 


476 


D38549


Homo sapiens 


hal025 is new 


6533 


99 


477 


AF241230 


Homo sapiens 


TAK1-binding protein 2


3656 


100 


478 


AIj031S34 


Schizosaccharomyces pombe


putative asparagine synthase 


482 


40 


479 


L28125


Podospora 
anserina 


beta transducin-like protein


233 


26 


480 


AF161544 


Homo sapiens 


HSPC059 


434 


77 


481 


AJ238248


Homo sapiens 


centaurin beta2 


3986 


99 


482 


Z38061 


Saccharomyce 
s cerevisiae


mal5, sta1, len: 1367, CAI:
0.3, AMYH_YEAST P08640
GLUCOAMYLASE S1 (EC 3.2.1.3)


295 


23 


483 


AF161381 


Homo sapiens 


HSPC263 


1404 


100 


484 


AF223468 


Homo sapiens 


AD021 protein 


1314 


100 


486 


X57527


Homo sapiens


alpha 1(VIII) collagen


4166 


99 


487 


Y190S2 


Homo sapiens 


39k3 protein 


2475 


100 


488 


Y73373


Homo sapiens 


HTRM clone 921803 protein
sequence . 


555 


S6 


489 


AL021918 


Homo 
sapiens 


i>34I8,i (Kruppel related Zinc 
Finger protein 184) 


4184 


100 


490


X53773


Rattus
norvegicus 


alpha- c large chain (AA 1- 
938) 


4675 


97 


491 


US2426 


Homo sapiens


GOK 


1459 


S9 


492 


AIj3S9773 


Leishmania 
major 


possible threonine synthase 


702 


45 


493 


AF226614 


Homo sapiens 


ferroportinl 


2929 


100


494 


Z93241 


Homo sapiens 


dJ222E13.1 (novel protein
with some similarity to
Drosophila KRAKEN)


513 


96 


495 


w ^ u ^ f i 




unknown


1812 


100 


496 


U93564 




p40 


133 


45 


497 


Y91405 


Homo sapiens


Human secreted protein 
SEQ ID NO: 126. 


357 


100 


498 


AF069781 


Drosophila 
melanogaster 


Bem46-like protein


653 


43 


499 


Y16601 


Homo sapiens 


Human cell-cycle
phosphoprotein CECVP-2. 


1658 


98 


500


X70944 


Homo sapiens 


PTB-associated splicing
factor 


3883 


100 


501


AF027503 


Mus 

musculus


putative membrane-associated
guanylate kinase 1 


205 


36 


502 


AF262874 


Homo sapiens 


nectin 3; PRR3 


2856 


99 


503 


AJ249732 


Homo sapiens 


G8 protein 


669 


100 


504 


AF208861 


Homo sapiens 


BM-019 


1629 


100 


505 


L09708 


Homo sapiens 


complement component C2 


4022 


100 


507 


X66285 


Mus musculus


HCl ORF 


115 


43 


508 


000189 


Rattus 
norvegicus 


Na+,K+-ATPase alpha-subunit


5227
SEQ
ID
NO:
509


ACCESSION
NUMBER


SPECIES


DESCRIPTION


SMITH-
WATERMAN
SCORE


%

IDENTITY


510 
511 
OXa 


Y94971 

AB019038
AB019038
AB019038


Homo sapiens

Homo sapiens
Homo sapiens
Homo sapiens


Human secreted protein clone
^^^"^^-^ protein sequence SEO 
ID NO:148. 

beta-1,4 mannosyltransferase
beta-1,4 mannosyltransferase
beta-1,4 mannosyltransferase


2176 

781 
1347 


100 

77 
100 


513 
514 
515 

516 


X84908 
X528S1 
AF186084 

G03602


Homo sapiens 
Homo sapiens 
Homo
sapiens


phosphorylase kinase
peptidyl prolyl isomerase
epidermal growth factor
repeat containing protein 


1520
5729 
650 
3046 


99
99

76


517 


U04706 


Homo sapiens 
Bos taurus 


Human secreted protein, SEQ 
ID NO: 7683.
50 kDa protein


505 


99 


518 
519 


G00653 
AF16147S 


Homo sapiens 
Homo sapiens


Human secreted protein, SEQ 
ID NO: 4734.

H£jfC12b ~ — . 


1749 
530 


77
100 


520 
521 


Y99366
AF266852 


Homo sapiens 
Homo sapiens 


Human PR01475 (UNQ746) amino
acid sequence SEQ ID NO: 98.
PTPIA


1368 
3394 


100
97


522 
523




Archaeoglobu
s fulgidus


Chromosome segregation

protein (smcl) 


1295 

15 i3 


100
20 


524




Homo sapiens 


immunoglobulin heavy chain

variable region


60S 


97 


525 


ALr223B30 
W01S35 


Rattus
norvegicus 


AREl 


2950 


98 


526 


AFl456se 


Homo sapiens


Cellular homologue of the 
SV40 large T antigen. 


1276 


83 


527 


AF112213


Drosophila
melanogaster


BCDNA.GH10229 


320 


0 J I 


528 


D49387 


Homo sapiens 


putative Rab5-interacting

protein 


524 


79


529 


Y30819


Homo 
sapiens 


NADP dependent leukotriene b4 
12-hydroxydehydrogenase


"i^i^ 


100


530 


AL079335 


Homo sapiens 


Human secreted protein 
encoded from gene 9.


328 


32


531 


Y91506 


Homo sapiens 


dJ132F21.3 (72.1 KDa protein
(DKFZP5ff4A032, SBBI88)
similar to mouse IFN-gamma
induce MGll.)


1059




532 


X76116 


Homo sapiens


Human secreted protein
sequence encoded by gene 56
SEQ ID NO: 179.


1159 


98 


533 


X76116 


Caenorhabdit 
is elegans 


carrier protein (c2) 


576 


50


534 


X12966 


Caenorhabdit
is elegans


carrier protein (c2)


505 


50


535


^09267 


Homo sapiens 


•J w/w»>a^y X ^ ^0/\ CfxlOiase 

propeptide (424 AA.) 


1972 


100


536 
537 
538 


^11773 
1384224 

□84224 ] 


Homo sapiens 

Homo sapiens
Homo sapiens
Homo sapiens


flavin-containing
monooxygenase 2

SRE-ZBP

methionyl tRNA synthetase


2486

2201
4741


100
99

99


539 

540

541

542


384224 " 1 
103244" I 

r92514 1 


Homo sapiens
Homo sapiens
Bos taurus

Homo sapiens


methionyl tRNA synthetase

methionyl tRNA synthetase

methionyl tRNA synthetase

H+ ATPase 31kDa subunit (EC

3.6.1.3)

iuman OXRE-ix. 


3887 
2933 
4529 
B48 


99
96
99

77


543 / 

544 p. 


^221712 I 
£ 


Homo
sapiens


Smad- and Olf-interacting
zinc finger protein


2151 ( 


^9 1 

51 


545 A 


LE000919 N 
X 

t 

0 

06669 B 


Methanobacterium
thermoautotrophicum


conserved protein


507 ; 






C 


ynthetic p 
onstruct 


reTtiF-betal "2 

— L 


070 5
TABLE 2



SEQ 
ID 
NO:



546 



546 



549



550



552 



554



555 



ACCESSION 
NUMBER 



SPECIES 



Y02698 



Homo sapiens



AF1I2205 [ Homo sapiens 



DESCRIPTION 



Human secreted protein 
encoded by gene 49 clone 
HTPCS60 . 



X60271 



Mus musculus



AC016e27 



Arabidopsis
thaliana



Y70400 



AB04e36S 
Y57880 



Homo 
sapiens 



Homo sapiens 



Homo sapiens 



AF119BSS j Homo sapiens 



m7236 



AL07846e 



Homo sapiens 



Arabidopsis
thaliana



AC006963 Homo sapiens 



WSB-l protein 



c-rel 



putative GTPase 



Human cell-signalling
protein-2.



NEDD4-like ubiquitin ligase 1



Human transmembrane protein 
HTMPN-4. 



PR01847 



MHC HLA-DQ alpha precursor



putative protein 



similar to Kelch proteins; 
similar to BAA77027 
{PlD:g46SG844) 



SMITH-
WATERMAN 
SCORE 



854 



2275 



2264 
"810 



429 



8290 



1112 



265 



1332 



540 



SIS 



IDENTITY 



98 



100 



74 



42 



66 



100 



559



M12140 



Homo sapiens 



W74825



Homo sapiens 



FLJ00086 protein 



Homo sapiens 



pol gene protein; Xxx 



1623 



Human secreted protein 
encoded by gene 97 clone 
HAQBP73 . 



225 



98 
48 



56 



560 
561 



X56S81 
AF003156 



Homo sapiens 



junP protein 

contains weak similarity to 



373 



Caenorhabdit 
is elegans 



an AMP-binding motif



54 



Homo



sapiens 



dJ10€9P2.3,l (novel PABPCl 
(poly <A) -binding protein) 
BCDNA.GH09817 



100 



565 



566



567 



569 



570 



571 



572
573 



575 



AF052723 



Drosophila 
melanogaster 



289 



Feline 
leukemia 
virus 



gag-pol precursor polyprotein 
gPrSO 



AF161472 I Homo sapiens" 



y28B17 



HSPC123 



U09848 



Homo sapiens 



AF155113 



Homo sapiens 



Pt326_4 secreted proteinT 



439 



Homo 



zinc finger protein 



3338 



AF155113



sapiens 
sapiens 



NY-REN-55 antigen 



1738 



3603 



AL032821 | Homo sapiens



NY-REN-55 antigen



M69181 



M69181 



Homo 
Homo 



dJ5SC23.1 (vanin 1) 



3951 



sapiens 



non-muscle myosin B



1821 



Homo 



sapiens 
sapiens 



non-muscle myosin B



Secreted protein 108-008-5-0- 
E6-PL. 



AL36S234 



Arabidopsis 
thaliana 



putative protein 



42 



AT 



44 



100 



100 



93 



99 



98 



99 
98 



100



Arabidops 



psis 
thaliana 
Homo sapiens 



putative protein 



40 



578 
STT 



581 



AB041642 



DNA polymerase alpha-subunit
(AA 1-1462)



7619 



D86984 



Homo sapiens 
Homo sapiens 



PAR-6



AF165124 Homo sapiens 



Similar to yeast adenylate 
cyclase (SS6776) 



1342 



2446 



W8e8l2 



Homo sapiens 



gamma-aminobutyric acid A
receptor gamma 2 



Polypeptide 
by gene SS . 



fragment encoded 



2499 



2339 



99 



100 



99 



99 



582 
583 



584 



585 



082319 
P92219 



AJ22394S 



Y08612 



Homo sapiens
Homo sapiens
(human)

Homo sapiens



novel ORF 



342 



CRl protein. 



11425 



RNA helicase 



Homo sapiens 



Y42384 



88kDa nuclear pore complex
protein



6608 



3874 



Homo 
sapiens 



AFl^y /bfe ) Homo sapiens 



Amino acid sequence of 
lv310 7. 



BAT4 



100 



37 



98
TABLE 2
SEQ
ID 
NO: 
588 


NUMBER 
AK131775 


SPECIES 
Homo sapiens 


DESCRIPTION 

Unknown 


SMITH- 
WATERMAN 
SCORE 
1929 


IDENTITY
99 


589 
591 


Aa2S086S 
298865 


Homo sapiens 
Homo sapiens 


TESS 2 

dJ522J7.2 (bromodomain-


2348 
4167 


100 
100 








containing 1 (similar to


592 
593 


L76571
AF091622 


Homo sapiens 
Homo sapiens 


peregrin, BR140))
nuclear hormone receptor 
PHD finger protein 3 


1355 
90S4 


100 

100


594 
595 
596 

597


XS6807 

Abl37802 

AL022329 

AF226048 


Homo sapiens 
Homo 
sapiens 
Homo sapiens 


cU798Aio.l (novel protein) 
bK407F11.2 (adrenergic, beta,
receptor kinase 2)
GL003


4443 

-il2 

3653 


100

55 

100 


598 


AJ278112


" jf c * ii» J 

>Y49635 

Y4963S 21- 
OCT-1999 IS- 
APR- 19 9 8 
Human sdp3 . 5 
protein. 

[Homo 
sapiens 


putative cell cycle control 
protein 


2009 


99 
23 


599 
600 


Y59741 
L36531 


Homo sapiens 
Homo sapiens 


Human normal ovarian tissue 
derived o jrrti- t n in 
integrin alpha 8 subunit 


1574 


^9 


601 


Y38458 


Homo sapiens 


Human secreted protein 


5386 
895 


99 
100 


602 
603


AF218584 


Homo sapiens 


GGA1


3265 


100 


604 


Y13115 
AL132776


Homo sapiens 
Homo sapiens 


serine/threonine protein
kinase

dJ393D12.1 (KIAA0776)


5071 


99 


605
606


AL034452 
Y14494 


Homo sapiens 
Homo sapiens 


dJ682J15.1 (novel Collagen
triple helix repeat
containing protein)
aralar1


2413 
1979 


99 
100 


607 


AJ001981 


Homo sapiens 


OXA1L


34^5 
2603 


100


608 


X86098 


Homo 
sapiens 


binds directly to adenovirus
type 5 E1A protein


3069 


100 


610 
611 


AF163572
AF161503 


Homo sapiens 
Homo sapiens 


Forssman glycolipid
synthetase

HSPC154


1865 
1261 


99 
97 


612 
613 


1*41834 
Y919S4" 


Ensis minor 


nuclear protein 

Human cytoskeleton associated

protein 9 (CYSKP-9).


345 
3668 


30 
100 


614 
615 


AL022327 
X8S786 


Homo sapiens 
Homo sapiens 


dJ355C18.1 (KIAA0027) 
binding regulatory factor 


361 


94 


616 
617 


Y08319 
D12644


Homo sapiens 
Mus musculus 


kinesin-2 
KIF2 protein


3203 
3487 


100

99 


618 


U28'>a9 


Mus musculus 


PACT 


3609 
5936 


97
89 


619 


Y35914 


Homo sapiens 


Extended human secreted 


X684 


93








protein sequence, SEQ ID NO. 
163. 




620 


AS0463B2 


Mus musculus 


testis-abundant finger


199 


23 








protein 


621 
622 


Y60062
AF068286


Homo sapiens
Homo sapiens


precursor polypeptide (AA -23 

to 1120) 

HDCMD38P 


3440 


99 


623 


X98248 


Homo sapiens 


sortilin


861
4436


100

99 


624 


X61100


Homo sapiens


75 kDa subunit NADH
dehydrogenase precursor 


3734 


99 


625 


S58544 


Homo sapiens 


75 kDa infertility-related
sperm protein 


2125 


99 


626 


AF1S1027 


Homo sapiens 


HSPC193 


582 


93 


627 


X14968


Homo sapiens


RII-alpha subunit (AA 1-404)


2079 


100 


628 


Y50911 


Homo sapiens 


Human fetal brain cDNA clone 
vb7_l derived protein 


1983 


100
TABLE 2



SEQ 
ID 
NO: 



629 



633 



638 



633 



640 
641 



642 



645



649 



650 



ACCESSION
NUMBER 



y SO BTT 



AF0987a6 



AL03455S 



W74826 



SPECIES 



DESCRIPTION 



Homo sapiens 



Homo 
sapiens 



Human fetal brain cDNA clone
yb7 1 derived protein
17 beta-hydroxysteroid



Homo 
sapiens 



dehydrogenase type VII
1 CIJ134019.3 (zinc finger 



I protein ISI fpH2-67) ) 



AF288288 



X663S7 



yil284 



AB004e84 



AJ002303



AJ002304
AJ002303



P87682 



M14660 



X0g661 
AF119900 



AB03104a 



AF2S0842 



Homo sapiens | Human secreted protein 

encoded by gene 9B clone 
HAQBT94



Homo sapiens | HPT protein



Homo sapiens | pRGfel " 



Homo sapiens 



serine/threonine protein
kinase 



SMITH- 
WATERMAN 
SCORE 



1694 



1754 



4273 



794 



223?" 
823 



Homo sapiens 
Homo sapiens 



AF^l 

1*10;- alpha" 



Homo sapiens | synaptogyrin 1c



Homo sapiens | synaptogyrin 1b



Homo sapiens 



Homo sapiens 



Homo sapiens | ISG-K54 



synaptogyrin 1c
similar to a C. elegans
protein encoded in cosmid
T26A5.



1S89 



"257T" 
3718^ 
1020 



1002 



933 
5676 



Homo sapiens | calbindin (AA 1-261) 



Homo sapiens | PR0282^ 



Drosophila
melanogaster



microtubule associated-
protein orbit 



U67934 



AF236061 



AIi034553 



Drosophila | multiple asters
melanogaster

Homo sapiens | Mi-2 protein



Homo sapiens 



Oryctolagus
cuniculus
Homo sapiens 



44.9 kDa protein ClSBll 
homolog 



RING-finger binding protein 



dJ914P20.2 (KIAA0784 protein
similar to Mus musculus 
activity-dependent 
neuroprotective protein 
(Adnp) ) 



2473 
1358 



738 



iOllG 



827 



3830 
5708 



IDENTITY 



100 



100



96 



100 



99 



100 



98 



99 



100 



94 



100 



99 



100
76



27 



29 



99 



96 



91 



xuo 



Homo sapiens 



GABA-A receptor alpha 1
subunit 



2388 



99 



656 



658 



659 



60 
661 



Y57908



Homo sapiens similar to f-spondin proteins 
I AB0Q6086 (PID :g2S2922S) 



3026 



Homo sapiens 



Z34975 



Human transmembrane protein 
HTMPN-32. 



AIiOS0306 



Homo sapiens 



W76734 



Homo sapiens



AF202724 
Z21966 



Homo 
sapiens 



IdiCp 

dOr475B7.2 (novel protein)' 



3733 



I Human moia Rho targeting 
I protein 



1942 



Homo sapiens Sadl imc»64 domain prot^IH^ 



2172 



99 



100 



99



34 



662 
663



667 



669 



AJ242954 
AF182316 



XS9303 



yi33SS 



AB010692 



sapiens 

Mus musculus 



mPOU homeobox protein



Homo sapiens
Arabidopsis



dysferlin



thaliana



myoferlin
hypothetical protein 



1529 



Homo sapiens 
Homo sapiens



valyl-tRNA synthetase
Amino acid sequence of



Arabidopsis
thaliana



protein PR0220.



contains similarity to endo- 

beta-N-acetylglucosaminidase 
gene 



6232 



209 



3393 



3^92" 



611 



100 



59 
99 



30 



99 



100 



S2 



671 
672 



673 



674 



675 



AB039371 



Kus muscu!ius | taiifi" 



Homo sapiens 



AF269223 



mitochondrial ABC transporter



4474 



2902 



AF229633 



Homo sapiens | TCP11



1*14463 



Mus musculus 



Rattus



groucho-related protein 4
' transcLuciii " " 



806 



4053 
3619 



159 



76 



99 



99 



92
TABLE 2



SEQ 
ID 
NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH-
WATERMAN
SCORE


%

IDENTITY 






norvegicus 








676 


ACQ 05757 


Homo sapiens 


R32611_1


2779 


100 


677 


S61069 


Homo sapiens 


reverse transcriptase
homolog=pol {retroviral
element}


252 


65 


678 


AF27138S 


Homo sapiens 


CMP-N-acetylneuraminic acid 
synthase 


2273 


100 


679 


X79066 


Homo sapiens


ERF-1


1783


100


680 


AF118566 


Mus musculus


hematopoietic zinc finger 
protein 


769 


50


681 


Y51415 


Homo 
sapiens 


Human wild type pKe83 
protein. 




99 


682 


AL133S4S 


Homo sapiens 


bA386M14.1 (novel protein 
similar to a dual specificity 
phosphatase)


700 


68 


683 


Y86214 


Homo sapiens 


Nuclear transport protein
clone h£b341 protein
sequence.


5888 


99 


684 


Y94952 


Homo sapiens 


Human secreted protein clone
fhll6__ll protein sequence 
SEQ ID NO: 110. 


354 


98 


685 


AIi021878 


Homo sapiens 


dJ257I20.4 (transcription
factor 20 (AR1) (KIAA0292)
(isoform 2))


154 


67 


686 


AB000198 


Escherichia 
coli 


orf, hypothetical protein


628 


100 


687 


M58378


Homo sapiens


synapsin I


3730 


99 


688


AF039697


Homo sapiens


antigen NY-CO-31


508 


98 


689 


U09355 


Oryctolagus 
cuniculus


protein phosphatase 2A1 B
gamma subunit


2356 


99 


690 


AF155106


Homo sapiens 


NY-REN-36 antigen 


265 


SO 


691 


AC004774 


Homo sapiens 


DlX-S 


1542 


100 


692 


X90530


Homo sapiens 


ragB 


X926 


99 


693 


X90530


Homo sapiens 


ragB 


140S 


99 


694 


X90530


Homo sapiens 


ragB 


1590 


85


695 


G01563 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5644.


330 


100 


696 


ACOilSlO 


Arabidopsis
thaliana


Putative methionine
aminopeptidase


669 


52 


697 


AJ250425


Rattus 
norvegicus 


Collybistin I 


2455 


98 


698 


AB037901 


Homo 
sapiens 


gene amplified in squamous 
cell carcinoma-1


5364 


99 


699 


Y99401 


Homo sapiens 


Human PR01327 (UNQ687) amino
acid sequence SEQ ID N0:2J8- 


1386 


100 


701 


AF221712 


Homo 
sapiens 


Smad- and Olf-interacting
zinc finger protein 


6705 


100 


702 


X83573 


Homo sapiens 


ARSE 


3184 


99 






Homo sapiens


AP**2rep protein 


2078 


99 


704 


Y7a262 


Homo sapiens 


Human chondromodulin-like
protein, Zchml.


1697 


94 


705 


Y71262 


Homo sapiens


protein, Zchtnl. 


1736 


...... 


706 


Y412S7 


Homo sapiens 


Amino acid sequence of long
human FAIM. 


1060 


100 


707 


AL022237


Homo sapiens 


bK119lB2.3 (PUTATIVE novel 
Acyl Transferase similar to 
C. elegans C50D2.7) (isoform
D) 


2030 


100 


708 


AJ006266 


Homo sapiens 


AWD-l protein 


5942 


100 


709 


G01S71 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 5652.


777 


99 


710 


Y0a698 


Homo sapiens 


ranbp3 


2849 


98 


711 


Y68770 


Homo sapiens 


Amino acid sequence of a
human phosphorylation
effector PHSP-2.


754 


99 



TABLE 2



SEQ 
ID 
NO: 


NUMBER






SMITH-
WATERMAN 
SCORE 


IDENTITY


712 


U93574 


Homo sapiens 


putative pl50 


799 


59 


713 


AC004531 


Homo sapiens 


Gene with similarity to DEAD
box helicases


2715 


99 


714 


D89016 


Homo sapiens 


Neuroblastoma


S38 


48 


715 




Homo sapiens


Human cardiovascular system 
associated protein tyrosine 
pnospnawASc ^ • 


734 


98


716




— T ' 

Homo sapiens 


i.j.vo . J \projL>aDxe u]rac.j.j. 

t* VATI f"^* i*?* \ 


862 


100


717 


AB035123 


Mus musculus


GD1 alpha/GT1a alpha/GQ1b
alpha synthase 


1696 


93 


718 


Y96290 


Homo >P402S4 
P402S4 25- 
OCT-1984 09- 
APR-1983 
Human IgP, 
(Homo 


Human rGFAM-2 immunoglobulin. 


2345 


85 


719 


X07979 


Homo sapiens 


integrin beta 1 subunit 
precursor 


4347 


99 


720 


AJ224819 


Homo sapiens 


tumor suppressor 


2149 


99 


721 


Y07S95 


Homo sapiens 


transcription factor TFIIH 


2373 


100 


722 


W41565 


Homo 

sapiens) 

>W41564 

W41S64 08- 

OCT-X997 05- 

APR-1996 

Human 

calpain. 

[Homo 

sapiens 


Human calpain. 


IS91 


99 


723 


7^161341 


Homo sapiens 


HSPC078 


1097 


98 


724 


AF1B7318 


Homo sapiens 


F-box protein Fbx2 


1607 


100 


725 


ACa06708 


Caenorhabdit
is elegans


contains similarity to
Saccharomyces cerevisiae pre-
mRNA splicing protein PRP31
(GB:Z72876)


1143 


46 


726


7\ ftn f\ /Z '~I r\ n 

AC-UUb /Oo 


Caenorhabdit
is elegans


contains similarity to
Saccharomyces cerevisiae pre- 

(GB:Z72876) 


988 


45 


727 


AC024B18 

W A- N-* J- 1-> 


is elegans 


family PF00400 (WD domain, 
G-beta repeat) r score»«8l-8, 
E«1.4e-20, N«i3 


QSQ 


44 


728


AJ00S897 


Homo sapiens 


JMS 


831 


47 


729 


Y45377 


Homo sapiens 


Human secreted protein 
fragment encoded from gene 
27. 


908 


97 


730 


G03931 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8012. 


578 


100 


731 


AB012720 


Oncorhynchus 
masou


GTP-binding protein 


3865 


76 


732 


W73404 


Homo sapiens 


Human secreted protein 
encoded by Gene No. 8.


862 


97 


733


G02650 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6731. 


644 


97 


734 


AC024 813 


Caenorhabdit
is elegans 


Hypothetical protein 
YS4F10AL.a 


152 


24 


735 


AL035461


Homo sapiens


dJ967N21.6 (novel CDP-alcohol
phosphatidyltransferase
family member protein) 


1562 


98 


736 


UO0Q33 


Caenorhabdit
is elegans 


similar to S. cerevisiae YJU2 
protein 


605 


41 


737 


AF079098 


Homo 
sapiens 


arginine-tRNA-protein
transferase 1-lp; ATEl-lp 


2733 





TABLE 2



SEQ 
ID 
NO: 
738


ACCESSION 
NUMBER 

AJ131712 


SPECIES 
Homo sapiens 


nucleolar RNA-helicase 


SMITH- 
WATERMAN 
SCORE


IDENTITY 


739 


AJ133115 


Homo sapiens


TSC-22-like protein


2793
20S4


100

99 


740 


X98258 


Homo sapiens


M-phase phosphoprotein 9


953


100


741


X98258


Homo sapiens


M-phase phosphoprotein 9


564 


/4 


742 


U97191 


Caenorhabdit
is elegans


strong similarity to the YPT1
sub-family of RAS proteins


you


85


743 


X76057 


Homo sapiens 




2i9l 


100 


744 


G0320S 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7290. 


496 


98


745


X97064


Homo sapiens


ACvr<«o proc6 2.xi 


4034 


99


746 


W93946 




Human regulatory molecule 


994 


100 


747 


y733ea 


Homo sapiens


iiiiun wxone <.u/d4u4 procem 
secjuence • 


1565 


99 


748 


M19S29 


Sus scrofa 


follistatin A 


1906 


98 


749 


Av724 9457 


Trichomonas


centrin, putative


183 


28 






* » XQ 






750 


AC004410 


Homo sapiens 


f0339SS4 1 


2094 


100 


751 


AF074968


Homo sapiens 


p4 /ifJGi protein 


2167 


100 


752 


AF252284 


Homo sapiens 


transcription specificity 
factor Spl 


4005 


100 


753 


AB049629 


Homo sapiens 


phospholysine

phosphohistidine inorganic 
pyrophosphate phosphatase 


1375 


99 


754 


D79205 


Homo sapiens 


ribosomal protein L39 


160


71


755


AB066430 


Homo sapiens 


CDEP 


142 


29


758 


Ii32162 


Homo sapiens 


transcription factor 


574 


80 


759 




Homo sapiens 


RING zinc finger protein 


295 


S4 


760 


Y442S0 


sapiens 


Human cell signalling 
protein- 13 . 


625 


100 


761 


AF218586 


Homo sapiens 


Cide-b 


1136 


100 


762 


U38934 


Gallus 


histone H2A


625 


97 


763 


AF226053 


Homo sapiens




606 


32 


764 


X13403




Oct-1 protein (aa 1-743)


3626 


100 


765 


D87446 


Homo sapiens 


Similar to a C.elegans
protein encoded in cosmid
C27F2 (U40419)


568 


38 


766 


AIi023828 


is elegans




200 


27 


767 


Y82777 


Homo sapiens 


i»J»i»JJ.W.Ali i^V^ACAWU' JtPJLWUCJLn 

(Clone dw665_4) . 


- 

2551 


99 


768 


X92475 


Homo sapiens


ITBAl 


1429 


100 


769


Y42752 


Homo sapiens 


3 (CaBP-3). 


1426 


100 


770 


X51416 


Homo sapiens 


hormone receptor hERR1 (AA 1-521)


1641


97 


771 


AJ006531 


Homo sapiens 


cysteine-rich protein 


1793 


100 


772 


A08695 


Homo sapiens 


rap2 


935 




773 


Z12173 


Homo sapiens 


N-acetylglucosamine-6-sulphatase


2970 


100 


774 


Y919S0 


Homo sapiens 


Human cytoskeleton associated 
proteins (CYSKP-S) . 


565 


43 


776 


AL023799 


Homo sapiens 


dJ322P7.1 (zinc finger)


655


56


777 


AL023799 


Homo sapiens 


dJ322P7.1 (zinc finger)


8^5 


56 


778 


G01880 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5961.


649


98 


779 


AJ012590


Homo sapiens 


glucose 1-dehydrogenase


4155 


99 


780 


Al6"78S82 


Homo sapiens 


CIiJ130E4 . 2 (KIAA0796 ) 


1321 


68 


781 


Z7S95S 


Caenorhabditis elegans


Similar to mitochondrial 
carrier protein 


384 


34 


782 


AIil0996S 


Homo 
sapiens 


cloril21G12.2 (SCAN domain- 
Containing 1 protein) 


900 


100 


783 


AF061262 


Mus musculus


semap cytoplasmic domain 
associated protein 2 


1316 


83 


784 


G03373 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 7954.


649


95






785


y84441 


Homo sapiens


Amino acid sequence of a
human RNA-associated
protein.


2074 


100 


786 


Y00918 


Homo sapiens


Human Rab protein, RABP-1,
protein sequence.


1048 


39 


787 


Z97029 


Homo sapiens 


ribonuclease HI large subunit 


1548 


99 


788 


AB035384 


Homo sapiens 


SRp25 nuclear protein 


962 


iS4 


789 


AF024fi3a 


Homo sapiens


ANG2 


2644 


100


790 


AJ006710 


Rattus norvegicus


ohosoha t ifivl 1 nnQ i — Ic ■! n^»<i*» 


4500 


97 


792 


V0063B 


bacteriophage lambda




600 


100 


793 


AF049103 




riuiiL>ingu4.ii inceraccxng 
protein 


819 


100 


795 


Z26317 


Homo sapiens 


desmoglein 2 


4810 


99 


796 


Y76884 


Homo sapiens 


Retinoblastoma binding
protein-7 sequence.


5080 


99 




Ul 5 i 55 


Gallus gallus


trypsinogen


372 


37 


798 


U97189


Caenorhabditis elegans


strong similarity to the
fXJ/Pi4 ramily or Kinases 


227 


28 


799 


AF112201 


Homo sapiens


neuronal protein NP25


1053


100


800 


Ar /t>b 


Rattus norvegicus


serine-arginine-rich splicing
regulatory protein SRRP86


958 


63 


801


Ar /tSi}^ 


Homo sapiens 


placental protein 13-like
protein


743 


99 


802 


,r\J^ A u o w ^ 


Homo sapiens 


ri«-uU9 


7G6 


80 


803 


281097 


Caenorhabditis elegans


Similarity to Human 
retinoblastoma -binding 
pzrouem KoAf 4b yxobzaXji , b 
comes from this gene 


152 


27 


804 


G02113


Homo sapiens 


Human secreted protein, SEQ 


496 


98 


805 


AL12ie73 


Homo sapiens 


bA30SP22-l (novel protein) 


1160 


100


806 


AC013463 


Arabidopsis thaliana


puc^wive i^xfc'ase activacor 
protein 


264 


30 


807 


AC013483 


Arabidopsis thaliana


protein 


264 


30 


808 


AB0138eS 


Homo sapiens 




1494 


100 


809 


AF078a42 


Homo sapiens 


HOTTIi protein 


1581 


99 


810 


AF161421 


Homo sapiens 


HSPC303 


2X34 


96 


811


AF261689 


Homo sapiens 


nNA. Tin1 VTn*>T"S fit^ ^ir^e^ T *-»n nil 

ii/iw ^uxyiUBx.osc; Cj^&JLXOXl px / 

subunit 


734 


100 


812 


Z74029 


Caenorhabdit 
is elegans 


Similarity to C. elegans 
alcohol dehydrogenase comes 
from this gene 


610 


71 


813 


273497 


Homo sapiens 


CU240C2.2 (Core histone
H2A/H2B/H3/H4)


324 


100 


814 


W87689 


Homo 
sapiens 


Humem HTXFTi9 polypeptide. 


1464 


99 


815


X16282 


Homo 
sapiens 


zinc finger protein (217 AA)
(1 is 2nd base in codon)


1109 


99 




Z92539 


Mycobacterium tuberculosis


pth 


300 


36 


818 


AB030483 


Mus musculus 


B9 


197 


27 


819 


AL117555 


Homo sapiens 


hypothetical protein 


321 


94 


820 


AC005328 


Homo sapiens 


R26660_2, partial CDS


865 


97 


821 


G03951


Homo sapiens 


Human secreted protein, SEQ
ID NO: 8032. 


700 


99


822 


L34e07 


Musca domestica


transposase 


174 


20 


823 


G02928 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 7009. 


558 


78 


824 


Z99531 


Schizosaccharomyces pombe


caffeine-induced death protein 1


184


29






825


AJ006692 


Homo sapiens 


ultira hioli sni f v<av"5it--!n 


693


e^8 ■ 


826 


U23037 


Oryctolagus cuniculus


eIF-2Bepsilon


3406 


90 


827 


G03412 


Homo sapiens 


Human secreted protein, SEQ 
ID NO- 74 91 


464 


100 


828 


y30327 


Homo sapiens


Human secreted protein 


113 


44


829 


y32199 




oufnctn irecepcoir tnoxecuxe (RECJ 
encoded by Incyte clone 
2022379 . 


1012 


100


830 


VJ78279 


Homo sapiens 


Fragment of human secreted
protein encoded by gene 33. 


1264 


99 


832 


AB011S42 


Homo sapiens


■ "TwEyTpq" — — — — -— ■ ' 


;<i097 


100 




G02639 


Homo sapiens 


Human secreted protein, SEQ 


223 


70 


834 


AF119664 


Homo sapiens 


transcriptional regulator 
protein HCNGP 


1574 


100 


835 


AF119664


Homo sapiens


transcriptional regulator
protein HCNGP


1144 


89 


836 


AF119664


Homo sapiens


transcriptional regulator 
protein HCNGP 


1448 


94 


837


X12517 


Homo sapiens 


C protein (AA 1-159)


918 


100 


838


U32865


Drosophila melanogaster


ixnotte protein 


164 


24 


839 
840 


U27831 




TLS-associated protein TASR-2
striatum-enriched phosphatase


631 


56 


841 


AF286366 


Homo sapiens 


CamKI-like protein kinase


2840 
1796 


98 
100 


842 

Ui 


G02309 


Homo sapiens


Human secreted protein, SEQ 


278 


96 




AE003S1S 


Drosophila melanogaster


ade3 gene product 


113 


48 


844 


G013S0 


Homo sapiens 


Human secreted protein, SEQ 

XU (\Vy« 34 JX, 


629 


100


845 
84 7 


U2783a 
¥87788 


Mus musculus 
Homo sapiens 


glycosyl-phosphatidyl-inositol-anchored protein
homolog

Human RBP-26 protein. 


3305 


96 


848 


AF164 794 


Homo sapiens 


Dif £33 protein homolog 


2026 
2398 


100 
100 


849 


U413XS 


Homo sapiens 


ZNF127-XD ^ 


2458 


93 


850 


AF192784


Homo sapiens 


makorin 1 


2062


97 


851


y58^28 


Homo sapiens 


Protein regulating gene 


1548 


100 


852 


Z22968


Homo sapiens 


M130 antigen 


6205 


100 


853


Z22971


Homo sapiens 


M130 antio'en ^v^r•ar»*a^ i ni av 
variant 


6380 


100 


854


G03362


Homo sapiens


Human secreted protein, SEQ
ID NO: 7443.


330 


yt> 


855 
856


G03362
AF28S118 


Homo sapiens 
Homo sapiens 


Human secreted protein, SEQ 

ID NO: 7443. 

CGI-203 


203 


100 


857 


AC;U06069 


Arabidopsis thaliana


putative cleavage and
polyadenylation specificity
factor


452 
1383 


100 
55 


858


AL02154e 


Homo sapiens 


Cytochrome c Oxidase
Polypeptide VIa-liver
precursor (EC 1.9.3.1)


593 


100 


859 


L529S6 


Xenopus 
laevis 


ribonucleoprotein 


1664 


85 


860 


AF201947


Homo sapiens


MEK binding partner 1


616 


100 


861 


L31783 


Mus musculus 


uridine kinase 


1266 


92 


862 


AF161472 


Homo sapiens 


HSPC123 


602 


73 


863 


Z49068 


Caenorhabdit 
is elegans 


mitochondrial carrier protein


370


43 


864 


AF154108


Homo sapiens


tumor necrosis factor type 1
receptor associated protein


3559


99






865 


AE001530 


Helicobacter 
pylori J99 


putative 


230 


32 


866 


XS7807 


Homo sapiens


'*•*•-'-*■* Jt tmujiuwa J. J. y lit. 

chain 


699 


91 


867 


AL031673 


Homo sapiens 


dJ694B14.1 (PUTATIVE novel 
KRAB box protein with 18 C2H2 


4066 


99 


868 


Y11652 


Homo sapiens 


phosphate cyclase 


238 


100


869


^rvt ^ ct J yj 




high-glucose-regulated
protein 8


3041 


99 


870 


AB020648 


Homo sapiens 


KIAA0841 protein 


3237 


S9 


871




Homo sapiens


dJ167A19.1 (novel protein) 


1608 


100 


872




Homo sapiens


core histone macroH2A2.2


1866 


100 


873


/VJUUZ± J J X 


Homo sapiens


dJ366N23.1 (putative C.
elegans UNC-93 (protein 1,
C46F11.1) LIKE protein)


1129 


100 


874


X14608 


Homo sapiens 


propionyl-CoA carboxylase 


3579 


100 


875 


AL117334 


Homo sapiens 


dJ687F11.1 (novel protein
(part of translation of cDNA 
DKFZp434N061, Em:AL110249) ) 


306 


100 


876 


X79489 


Saccharomyces cerevisiae


E-925 protein 


446 


35 


877 


YS3001 


Homo sapiens 


Human secreted protein clone 
dne34_i protein sequence seq 
ID NO: 8. 


811 


100 


878


/iff z oXUo4 


Homo sapiens 


CHMP1.5


9S7 


100 


879 


X79417 


Sus scrofa


40S ribosomal protein S12 


687 


100 


880


AP001317 


Saccharomyces cerevisiae


Soilp 


478 


28 


881


Y87275


Homo sapiens 


Human signal peptide
containing protein HSPP-52
SEQ ID NO: 52.


2547 


100


882


M14036 


Homo sapiens 


C1-inhibitor


598 


77 


883


AB041261 


Homo sapiens 


calcium-independent
phospholipase A2


2903 




884


AF020313 


Mus musculus 


proline-rich protein 48


999 


84 


885 


Y10936 


Homo sapiens 


hypothetical protein 


1104


99 


886 




Mus musculus


myotubularin related protein
1


B66 


36 


887


Y57893


Homo sapiens


Human transmembrane protein

WTMPM-1 T 


1099 


94 


888 


AL11763S 


Homo sapiens 


hypothetical protein 


929 


99


889


AF210317 


Homo sapiens 


facilitative glucose 
transporter family member 


2046 


99 


890 


Y36031 


Homo sapiens 


Extended human secreted 
protein sequence, SEQ ID NO. 
416. 


S83 


100 


891 


Y36031 


Homo sapiens 


Extended human secreted
protein sequence, SEQ ID NO.
416.


192 


S7 


892 


AF237631 


Homo sapiens 


ubiquitous tropomodulin U-Tmod


1798 


100 


893 


AF090929 


Homo sapiens 


PR00477p 


6S3 


99 


894


AL031228 


Homo sapiens 


dJ1033B10.2 (WD40 protein
BING4 (similar to S.
cerevisiae YER082C, M. sexta
MNG10 and C. elegans F28D1.1)


3196 


100 


895 


AL031228


Homo sapiens 


dJ1033B10.2 (WD40 protein
BING4 (similar to S.
cerevisiae YER082C, M. sexta
MNG10 and C. elegans F28D1.1)


2825 


96 


896


AF171102 


Homo sapiens 


retinal degeneration B beta 


1302 


9S 


897 


AE003S51 


Drosophila
melanogaster 


CGI a 176 gene product 


633 


33


898 


AJ237946 


Homo sapiens 


DEAD Box Protein 5 


2443 


100 


899 


Z97184 


Homo sapiens 


EKE2 


624 


100


900 


Z97184 


Homo sapiens 


KKE2 


409 


98 


901


AJ245587 


Homo sapiens 


Kruppel-type zinc finger 


1942 


100 


902 


AF091034 


Homo sapiens 


GTP-binding protein RAB22A 


1011 


100


903 


R9S9S3 


Homo sapiens 


Eukaryotic cell growth
inhibiting factor.


414 


96 


904 


L04733 


Homo sapiens 


kinesin light chain 


1936 


72 


905 


AE003S40 


Drosophila 
melanogaster 


CG10984 gene product 


446 


33 


906 


M55542 


Homo sapiens 


guanylate binding protein 
isoform I


2993 


98 


907 


M55542


Homo sapiens 


guanylate binding protein 
isoform I 


2901 


96 


908 


W04O8S 


Homo sapiens 


Human membrane fusion protein 
WDProl , 


1889 


100 


909 


AF168676 


Homo 
sapiens 


TNF intracellular domain- 
interacting protein 


647 


100 


910 


AB0291S0 " 




KRAB zinc finger protein 
HFBlOir, 


i2196 


100 


911 


G02871


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6952. 


S21 


100 


912 


G03162


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7243.


387 


87 


913 




Homo 

sapiens] 

>Y92508 

APR-2000 06- 
OCT-1998 
Human OXRE- 
5 . {Homo 
sapiens 


dTDP-4 -keto - 6 - cleoxy- D- glucose 
4 -reductase 


1710 


100 


914 


U24189 


Caenorhabdit 
is elegans 


hypothetical protein 1207-1;
Method: conceptual
translation supplied by
authors


244 



41 


915 


Y02591


Homo sapiens 


A human progesterone receptor 
complex p23-like protein. 


843 


99 


916


AE0009B4 


Archaeoglobus fulgidus


dinitrogenase reductase
activating glycohydrolase
(draG)




"26 


918


M23159 


Cricetus 
cricetus 


DHFR-coamplified protein


153 


30 


919 


L12018


Caenorhabditis elegans


putative 


1232 


41


920 


AF102177 


Homo sapiens 


tumor antigen SI/P-8p 


1260 


97 


921 


AL096712


Homo sapiens 


dJ744I24.2 (similar to a 
novel human gene mapping to 
Activator) 


1017 


78 


922 


AL161495 


Arabidopsis thaliana


putative WD-repeat protein 


866 


42 


923 




Arabidopsis thaliana


putative WD-repeat protein 


442 


36 


924 


U97001 


Caenorhabdit 
is elegans 


similar to 

Schizosaccharomyces pombe 


605 


51 


925 


X71978 


Mus musculus 


Fi£ 


1503 


95 


926 


N:9228a 


Drosophila
melanogaster 


beta-spectrin 


2:90 


51 


927 


Y27575 


Homo sapiens 


Human secreted protein 
encoded by gene No. 9. 


1392 


100 


928 


Y22499 


Homo sapiens 


Human secreted protein
sequence clone mh703__l. 


2249 


100 


930 


AJ22432G 


Homo sapiens 


ribulose-5-phosphate-epimerase


912 


100 


931 


U28991 


Caenorhabditis elegans


coded for by C. elegans cDNA
cm21c7


660


ts-^


932 
933 


G01384 


Homo sapiens 
Homo sapiens 


hypothetical protein 

Human secreted protein, SEQ 

ID NO: 5965. 


210 

767 


25 
98 


934 


AJ276485 


Homo sapiens 


integral membrane transporter 
protein 


1200 


100 


935 
936


AL035681
AB026B0B 


Homo sapiens 


dJ756G23.3 (novel protein 
similar to Drosophila
transcriptional repressor) 


1142 


60 


937 


AB015345 


Homo sapiens 


syiicipuoucicfrnxn Ax 
HRIHFB22i.6 


2142 
2601 


95 
99 


938 


X65724 


Homo sapiens 


ORF2 


498 


100 


939


Vr89024 


Homo sapiens 


Polypeptide fragment encoded
by gene 156. 


1487 


100 


940 


G04047 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8128. 


117 


100 


941 


AF094SB3 


Homo sapiens 


putative HIV-1 infection
related protein 


452 


100 


942
943 


AC024200 
AF129756 


Caenorhabdit 
is elegans 

Homo sapiens 


contains similarity to 
several zinc finger proteins 
but not to the zinc finger 
domains 

G5C 


350 


69 


944 


K2376S 


Rattus norvegicus


alpha-tropomyosin


273 
133 


100 
96 


945


AC009917 


Arabidopsis
thaliana 


Contains similarity to 


583 


47 


946 


AF223468 


Homo sapiens 


AD02a. protein 


551 


4^ 


947 
948 


AF6S5473 
X75756


Homo sapiens
Homo sapiens


GAGE-8

protein kinase C mu


273 
2019 


51 


949 
950 


AF1439S6 
y36729 


Mus musculus

Homo sapiens


coronin-2

Human PG1 protein sequence.


2300 
1861 


93 
99 


951 


W49041 


Homo sapiens 


Human low density lipoprotein 
binding protein rtBP-2, 


232 


67 


952 


ABoieeei 


Arabidopsis 
thaliana 


gene_id:MXC17.7


203 


46


953 


Y0178S 


Homo sapiens 


Human ubiquitin-conjugating 
enzyme >Y25341 Y2S341 Ol-JUL- 
1999 12-AUG-1998 Human NCE-2 
protein. 


3^5 


100 


954 


AF145615 


Drosophila 
melanogaster


f^s^i^ii^n #\ynw ^ ^ ft 


823 


46 


955 


U09410 


Homo sapiens 


Zinc finger protein ZNF131


2483 


99 


956 


U09410


Homo sapiens


zinc finger protein ZNF131


1853 


99 


957 


AF195623 


Homo sapiens 


cholinephosphotransferase 1
alpha 


2126 


99 


958 


X94917 


Drosophila
melanogaster 


head-elevated expression in 
0.9 kb 


155 


32 


959 
960 


U54807 

Ti cn c o o l^ T 
At Uboo 07 


Rattus 
norvegicus
Bos taurus 


GTP-binding protein 
GTP -binding protein rah 


1167 


97 


961 
962 


G03244 
AFQ788S0 


Homo sapiens 
Homo sapiens 


Human secreted protein, SEQ 


606 
471 


97 
100 


963 


AP0017S4 


Homo sapiens 


Steroid dehydrogenase homolog 
transient receptor potential- 
related channel 1, a novel 
putative Ca2+ channel protein 


583 
317 


40 

30


964 


AL035419 


Homo sapiens 


dJllOOHia.l (putative novel 
protein) 


1129 


100 


965 


X61381 


Rattus rattus


interferon-induced protein


202 


46 


966 
967 


D38169 


Homo 
sapiens 


inositol 1,4,5-trisphosphate
3-kinase isoenzyme


3278 


100 




AL031432 


Homo 
sapiens 


dJ46SN24.2.1 (PUTATIVE novel 
protein) (isoform 1)


893 


100


968 


U7927S 


Homo sapiens 


unknown 


611 


100 


969


AJ011306


Homo 
sapiens 


guanine nucleotide exchange 
factor (long isoform) 


2752 


99 


970 


AF281134 


Homo sapiens 


exosome component Rrp4^ 


1186 


100 


971 


USi33^ • 


Caenorhabditis elegans


weak similarity over a short
region to myosin heavy chain 


536 


23 


972 


AC018749 


Leishmania 
major 


L8840.12 


589 


53 


973 


AF188504 


Mus musculus


LNV 


544 


85 


974 


U25801 


Homo sapiens 


Tax1 binding protein


852 


98 


975 


AF049523 


Homo sapiens 


huntingtin-interacting
protein HYPA/FBP11


1390 


St 
^ / 


976 


AF161530 


Homo sapiens 


HSPC182 


1040 




977 


G04020


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8101.


626 


100 


978 


AF164797 


Homo sapiens 


ribosomal protein L17 isolog 


908 


100 


979 


U94991 


Xenopus 
laevis 


transcription factor XJLMOi 


795 


97 


980 


S73775 


Homo sapiens


calmitine; calseguestrina 


2029 


100 


981 


^94886 


Homo sapiens


Human protein clone HP01462. 


2S01 


100


982 


AJ243191 




heat shock protein


827 


9^ 


983


X6S020 


Bos taurus 


PSST subunit of the NADH:
ubiquinone oxidoreductase


964 




984 


AJ249207 


Sp , AD4 5 


putative racemase 


351 


43 


985


Z30053 


Homo sapiens 


basic transcription factor 2,
35 kD subunit 


1576 


99 


986


AB030835 


Homo sapiens 


contains glutamine rich
domains, three zinc-finger
domains and matrin 3
homologous domain 3 (MH3)


4697 


99 


987 


AF2272S8 


Bos taurus 


RPGR-interacting protein-1


1262


38 


988 


AL022238 


Homo sapiens 


dJ1042K10.2 (supported by 
GENSCAN, FGENES and GENEWISE)


4048 


99


989 


AL022238 


Homo sapiens 


dJ1042K10.2 (supported by
GENSCAN, FGENES and GENEWISE)


2321 


99 


990


AF161426 


Homo sapiens 


HSPC308 


448 


92 


991 


AF161426 


Homo sapiens 


HSPC308 


448 


92 


992 


AF161426


Homo sapiens 


HSPC308 


4 S3 


92 


993 


AL023859


Schizosaccharomyces pombe


tRNA-splicing endonuclease
subunit 


172 


42 


994 


AL049631 


Homo sapiens 


dJ513M9.1 (novel Homeobox
domain protein) 


241 


47 


995 


AGO 052 S3 


Homo sapiens 


R26445 i 


902 


100 


996 


AF265206 


Homo sapiens 


MOGl isoform A 


974 


100 


997 


AJ248285 


Pyrococcus abyssi


sarcosine oxidase, subunit
beta (soxB) 


195 


28 


998 


AE003641 


Drosophila 
melanogaster 


BG:DS00941.3 gene product


218 


58 


999 


WS9343 


Homo 
sapiens 


Secreted protein of clone 
CR930_1. 


1340 


9B 


1000 


AY007135


Homo sapiens


similar to bovine ADP/ATP 
translocase Tl mRNA with 
GenBank Accession Number 
M24102.1


1S43 


100 


1001 


Y73381 


Homo sapiens 


HTRM clone 1877278 protein 
sequence . 


1666 


100 


1002 


AF208844 


Homo sapiens 


BM-002 


428 


100 


1003 


AE004944 


Pseudomonas 
aeruginosa 


hypothetical protein 


134 


35 


1004 


AL031431 


Homo sapiens 


dJ462O23.2 (novel protein)


2058 


100 


1005 


S45367 


Canis 
familiaris 


centractin 


1949 


100


1006


S45367


Canis familiaris


centractin 


1315 


98 


1007 


AB022158 


Mus musculus


chaperonin containing TCP-1 
epsilon subunit 


2649 


96 


1008 


Y76332 


Homo sapiens 


Fragment of human secreted 
protein encoded by gene 38. 


1282 


97 


1009 


AB011414 


Homo sapiens 


Kruppel-type zinc finger 
protein 


1671 


58 


1010 


Z58218 


Caenorhabdit 
is elegans 


K01H12.1


269 


67


1011 


AB011414 


Homo sapiens 


kruppel-type zinc finger 
protein 


1571 


58 


1012 


Z14000 


Homo sapiens 


RING1


2017 


100 


1013 


G02841


Homo sapiens 


ID NO: 6922.


332 


93 


1014 


AF145659


Drosophila melanogaster


BcDNA.GH10333


1244


52 


1015 


Y02&60 


Homo sapiens 


Fragment of human secreted 
protein encoded by gene 65.


664 


67 


1016


Y02591 




A human progesterone receptor 
complex p23-like protein. 


772 


97 


1017 


y99448 


Homo sapiens 


Human PRO1759 (UNQ832) amino
acid sequence SEQ ID NO:374.


2323 


100 


1018 


X67250 


Rattus norvegicus


n-chimaerin


1710 


97 


1019 


AF183417 


Homo sapiens


microtubule-associated
proteins 1A/1B light chain 3


631


100


1020 


AF16479S 


Homo sapiens 


sex-regulated protein janus-a


674 


100 


1021 


AFXB062B 


Coturnix coturnix


qdga-1 


638 


96 


1022 


AL133363 


Arabidopsis thaliana


putative protein 


155 




1023 


AB034912 


Homo sapiens 


WD-repeat like sequence


2483


100


1024 


Ay007091 


Homo sapiens 


similar to Homo sapiens
mammalian inositol
hexakisphosphate kinase 2
(IP6K2) mRNA with Ge


2243


100


1025


X69910 


Homo sapiens 


P63 protein 


2958 


99 


1026 


U8073^ 


Homo sapiens


CAGP9 


1657 


100


1027 


AB029333 


Halocynthia 
roretzi 


HrPET- 1 


1048 


54 


1028 


AB032931 


Homo sapiens


ubiquitin-conjugating enzyme 
isolog 


1045 


100 


1029


G01797 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5878. 


749 


98 


1030 


G01797 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5878.


749 


98 


1031 


AF193795


Homo sapiens 


vacuolar sorting protein 
VPS29/PEP11 


960 


100


1032 


AJ22296a 


Mus musculus 


L-periaxin 


120 


30 


1033 


Z81317 


Schizosaccharomyces pombe


DNA2-NAM7 helicase family 
protein 


685 


31 


1034 
1035


y41519 
XJ27G004 


Homo sapiens 
Mus musculus


Fragment of human secreted 
protein encoded by gene 75. 
Paxneb protein 


1321 


99 


1036 


AF02S4S9 


Caenorhabditis elegans


H14A12.3 gene product 


1709 
190 


77 
30 


1037 


U37251


Homo sapiens 


Description: KRAB zinc finger 
protein; this is a splicing 
supplied by author 


196 


43 


1038 


W74S80 


Homo 
sapiens 


Human membrane protein
BA0306. 


1921 


97 


1039 


aa8173 


Caenorhabditis elegans


weak similarity to
Arabidopsis thaliana 
ubiquitin-like protein 8 


331 


SO


1040


At'290 2 04 


Homo sapiens 


blood group carrier molecule — 
POKl 


1637 


99 


1041 
1042 


Y96730
AF140683 


Homo 

sapiens 

Mus musculus 


PRO539, a Costal-2 homologue.
F-box protein FWD2 


162 
2397 


" 98 . 


1043
1044


AF151023

AFiaiissi 


Homo sapiens 

Drosophila melanogaster


HSPC189
BcDNA.GH04929


1104 
204 


100 
37 


1045 
1046 


y77985 
AJ243972 


Homo sapiens
Homo sapiens 


j-u(iiciii (-'WO. -usty aZJlxnO aCiCi 

sequence . 


1940 


100


1047 


AB035863


Homo sapiens


ATP specific succinyl CoA
synthetase beta subunit 
precursor 


1317 
2324 


100 
99 


1048
1049 


AL034SS0 
AF163a25 


Homo sapiens 
Homo sapiens


aJnB4£'4.2 (novel protein 
similar to nucleolar protein 
4 (NOL4) (NOLP))
pre-B lymphocyte protein 3 


981 


92 


1050 
1051 


AF201949 
AF190624 


Homo sapiens 
Mus musculus 


60S ribosomal protein L30
isolog

mdgl-1 


634 
868 


100
100 


1052 


AE003S29 


Drosophila
melanogaster 


CG6151 gene product 


236 
160 


85 
44 


1053 


G01191 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 5272.


646 


98 


1054 

ios6 


AL162756


Neisseria
meningitidis


Glu-tRNA(Gln)
amidotransferase subunit A


682 


44 




AF1818S6 


Rattus


tRNA selenocysteine
associated protein 


1525 


99 


1056 


U89649 


Chlamydomonas reinhardtii


Mr19,000 outer arm dynein
light chain


244 


34 


1057 


AF159141 


Homo sapiens 


breast cancer metastasis- 

3^ SA^^ £ l3Sk S S^Jm Jit 


663 


S3 


1058 
1059 


AF230929 
AJ270952


Homo 
sapiens 
Homo sapiens 


keratinocyte annexin-like 
protein pemphaxin 


1710 


99 


1060

1061


AF224263 
X63417 


Heterodontus
francisci
Homo sapiens 


^uuauive meHiorane protein 
HoxDB 

IRLB ' — 


1363 
'742 


100 
83 


1062 


AI.07934S 


Streptomyces coelicolor
A3(2)


iiy cins t i k-c* J. procein 


1037 
143 


100 
27 


1063 
1064 


Y71112 
Bin CI A 


Homo sapiens 
Homo sapiens 


Human Hydrolase protein-10
(HYDRL-10).

acetyl -CoA synthetase 


2547 


100 


1065


yi3356 


Homo sapiens 


Amino acid sequence of
protein PRO221.


3493 
1363 


99 
100 


1066 


AC006153 


Homo sapiens 


similar to Aquifex aeolicus
GTP-binding protein; similar
to AE000771 (PID:g2984292)


^62 


98 


1067 


Y18930 


Sulfolobus 
solfataricus 


hypothetical protein 


162 


29 


1068 


R6S969 


Homo 

sapiens T98G 


Glioblastoma-derived
polypeptide. 


887 


100


1069 


Y07964 


Homo sapiens 


Human secreted protein
fragment 


863 


96 


1070 
1071 


AF177476 
AF245505


Rattus 
norvegicus 
Homo sapiens 


CDK5 activator-binding 

protein 

adlican


1995 


86 


1072 


U92794 


Mus musculus


alpha glucosidase II, beta 
subunit 


3109 
147 


99 
36 


1073 

1074 
1075 


S03889 

[J15779 
?i3392 


Homo sapiens 

Homo sapiens
Homo sapiens


Human secreted protein, SEQ

ID NO: 7970.

p70

Amino acid sequence of


698 

380 
1271 


98 

2^ 1


1076


protein PRO328.


1077 
1078 


AF161457 
y79503 

AF223466 


Homo sapiens 
Homo sapiens 

Homo sapiens 


HSPC339

Human carbohydrate-associated
nrotein f^pnnn_c 

HT015 protein 


571 
2151 


100 
98 


1079 
1080 


AL132965 
AB024937 


Arabidopsis
thaliana
Homo sapiens


putative WD-40 repeat-protein
LUNX 


831 
286 


66 
29 


1081


Y14768 


Homo sapiens 


V-ATPase G-subunit like
protein 


1284 
579 


100 


1082
1083 


AF016416 
Iil329l 


Caenorhabdit 
is elegans 


F29A7,4 gene product 
ADP-ribosylarginine hydrolase 


141 
802 


31 
45 


1084 
1085 

1086 


AB041541 
G01922 

AB030814 


Mus musculus 
Homo sapiens 

Homo sapiens


unnamed protein product 
Human secreted protein, SEQ 
ID NO: 6003. 

H-REV107 protein homolog 


151 
■"202 


44 

97 


1087 


AF151638 


Homo sapiens 


phosphatidylcholine transfer
protein


833
1142


100 
100 


1088 


ya4432 


Homo sapiens 


Amino acid sequence of a
human RNA-associated 
protein. 


2783 


100 


1089 

1090 
1091 


y9486'> 
AB041S86 


Homo 
sapiens 
Homo sapiens 
Mus musculus 


Human protein clone HP10563.

unnamed protein product 
unnamed protein product 


613 
130 


100 
49 


1092 
1093 


X71277 
034973 


Homo sapiens 
Mus musculus


Human Ziipo3 protein, 
protein tyrosine phosphatase- 
like 


1103 

606 

1131 


81 

100

93 


1094 


Y66677 


Homo 
sapiens 


Membrane-bound protein
PROS 2 a . 


S22 


S6 


1095 


Y87276


Homo sapiens


Human signal peptide
containing protein HSPP-53
SEQ ID NO: 53.


1029 


99 


1096 
1097 


y87276 
AF1614SS 


Homo sapiens 
Homo sapiens 


Human signal peptide 
containing protein HSPP-53
SEQ ID NO: 53. 

HSPC337 — 


863 




1098 

1099 
1100 
1101 


uaoo29 

iU005Q66 
AJ005865 
AJ005866


Caenorhabditis elegans

H AiTm Ajst ^ ^ ^ 

Homo sapiens 
Homo sapiens 


similar to thioredoxin

Sqv-7-like protein
Sqv-7-like protein
Sqv-7-like protein


742 
242 

1321 
1118 


98 
39 

99 
99 


1102 
1103 
1104 


AJ005866
AL110244
AF242194 


Homo sapiens 
Homo sapiens 
Drosophila 
melanogaster


Sqv-7-like protein
Hypothetical protein
brakeless-B


891 

1016 

299 

147 ■ ■ 


99 
99 
31 


1105
1106


AL031010


Homo sapiens 


aj422F24.1 {PUTATIVE novel 
protein similar to C. elegans 
■00202 .5) 


968 


100


1107 


U28016 
AJ278150 


Mus musculus
Homo sapiens


parathion hydrolase
(phosphotriesterase)-related
protein

putative lipid kinase


1624 


87 


1108

1109 


G03733 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7814.


2207 
495 


99 
98 




AF217287 


Drosophila 
melanogaster 


S protein RhoBTB 


834 


54


1110 


y28921 


Homo
sapiens


Human regulatory protein
HRGP-7.


941 


48 


1111 

1112


Y28921

AF176704


Homo
sapiens
Homo sapiens


Human regulatory protein
HRGP-7.

F-box protein FBX9


1331 


51 


1113
1114


AF182076
G04039


Homo
sapiens
Homo sapiens


glioma tumor suppressor
candidate region protein 2
Human secreted protein, SEQ


2027 
2418 

475 


99 
100 

96


ID NO: 8120.






1115


AF229439


Mus musculus


zinc finger protein 289 


1697 


91 


1116


L40357


Homo sapiens


thyroid receptor interactor 


509 


100 


1117 


L40357 


Homo sapiens


thyroid receptor interactor


404 


85 


1118 


A121S5 






1673 


100


1119 


AIj161&42 


Arabidopsis thaliana


isomerase like protein


607 


53 


1120 


AIi023754 




uuz/^uxo,x vKac 

Ca2+ /Calmodulin dependent 

Protein Kinase LIKE protein) 


2341 


98 ; ■ ""■ 


1121 


y"S7901 


Homo sapiens


Human transmembrane protein 


321 


36 


1122 


214122 


Xenopus laevis


XLCl.2 


455 


77 


1123 


AF225418 


Homo sapiens 


lipase 


1S31 


97 


1124 


y06518 


Homo sapiens 


Zen GTPaee interacting 
protein ZIP. 


3227 


100 


1125


AI*03S690 


Homo sapiens 


dJ202I21.1 (novel protein)


952 


100 


1126 


AJ000217 


Homo sapiens 


CLIC2


12B6 


99 


1127 


AB03050S 


Mus musculus


UBE-1C2 


1069 


79 


1128 


Y7357S 


Homo sapiens


HTRM clone 1427838 protein
sequence.


874 


100 


1129


y78941 


Homo sapiens 


Cyclophilin-type peptidyl 
prolyl cis/trans isomerase 
amino acid sequence.


877 


100 


1130 


AL023553 


Homo sapiens 


dJ347H13.4 (novel protein)


557 


100


1131


y91945 


Homo sapiens 


Human chaperone protein 6
(HCHP-6).


1408 


100 


1132 


Z68197 


Schizosaccharomyces
pombe


putative nuclear pore protein 


596 


39 


1133 


Z68197


Schizosaccharomyces
pombe


putative nuclear pore protein 


389 


35 


1134 


AF160681 


Homo sapiens 


guanine nucleotide exchange 
factor 


3597 


100 


1135


AF079765 


Mus musculus 


enhancer of polycomb


264


41 


1136 


M€l2419 


Mus musculus 


clathrin-associated protein 


2189 


99 


1137 


AJG06219 


Drosophila 
melanogaster


clathrin-associated protein 


1254 


78 


1138 


y7621B 


Homo sapiens 


Human secreted protein 
encoded by gene 95. 


440 


98 


1139 


W88104 


Homo
sapiens


A Rab protein designated 

nt<i\ae>~ 4. , 


1065 


99 


1140 


Y13401 


Homo sapiens 


Amino acid sequence of


3979 


98 


1141 


W8S026 


Chimeric - 
Homo sapiens 


Green fluorescent protein- 
*»c*P 'V jtusxon prouUCu • 


3309 


100 


1142 


yi3402 


Homo sapiens 


Amino acid sequence of 


1694 


99 


1143 


G03 87S 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7956.


660 


99 


1144 


Y12917 


Homo sapiens 


Amino acid sequence of a
human secreted peptide. 


750 


98 


1145


Y12917 


Homo sapiens 


Amino acid sequence of a 
human secreted peptide. 


1096 


100 


1146 


AL022157


Homo sapiens 


SPIN (SPINDLIN HOMOLOG 
(PROTEIN DXP34) ) 


1233 


100 


1147 


AL022157


Homo sapiens 


SPIN (SPINDLIN HOMOLOG 
(PROTEIN DXP34)) 


1233 


100 


1148 


U02S48 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6629. 


370 


99 


1149 


y7333 8 


Homo sapiens 


HTRM clone 2019742 protein 
sequence. 


1492 


100 


1150 


W74 841 


Homo sapiens 


Human secreted protein 
encoded by gene 113 clone 


228 


55
TABLE 2



SEQ 
ID 
NO: 


NUMBER 


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


IDENTITY 








HEAAR60. 






1151 


AF044201 




neural membrane protein 35;
NMP35


1570 


92 


1152 


AF1S6774 


sapiens 


lysophosphatidic acid
acyltransferase-gamma1


1855


99 


1153 


AL118501 




dJ1191N16.1 (A novel protein
(translation of the cDNA
DKFZp566A0946, Em:AL050069))


872 


64 


1154 


AF131852 


Homo sapiens


Unknown


473 


100


1155 


Y41705 


Homo 
sapiens 


Human PRO352 protein
sequence.


1381 


97 


1156 




Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8117. 


607 


99 


1157 


Jxr J, XA*dtH*i 


Lupinus
luteus 


L-asparaginase




43 


1158 


AF1S1B48 


Homo sapiens 


CGI-90 protein


232 


32 


1159 


Au2 72267 


Homo sapiens 


choline dehydrogenase 


2449 


100 


1160 


AB001773 


Ciona
savignyi 


PEM-6 


196 


33 


1161 


Y87330 


Homo sapiens 


Human signal peptide 
containing protein HSPP-107 
SEQ ID NO: 107. 


746 


S3 


1162 




Homo sapiens 


Human signal peptide 
containing protein HSPP-107
SEQ ID NO: 107. 


746 


63 




AF113S34 


Homo sapiens 


HP1-BP74 protein 


2723 


96 






Danio rerio 


Dcddl 


191 


41 


1165 


AL118501


Homo sapiens 


dJ1191N16.1 (A novel protein
(translation of the cDNA
DKFZp566A0946, Em:AL050069))


1051 


71 


1166 




Homo sapiens 


dJ1191N16.1 (A novel protein
(translation of the cDNA
DKFZp566A0946, Em:AL050069))


945 


75 


1167 


AF187733


Homo sapiens 


syntaphilin


83X 


42 


1168 


AB01943S 




phospholipase


9S1 


55 


1169 


AP064604 


Homo sapiens 


kED3 protein 


324 


33 


1170 


Y01164 


Homo sapiens


Polypeptide fragment encoded
by gene 6. 


1191 


100 


1171 


1^03188 


s cerevisiae 


putative 


180 


22 


1172 


AF113751


Mus musculus 


nuclear pore membrane 
glycoprotein POM210 


3341 


81 


1173


Aa245417 


Homo sapiens 


GSb protein 


794 


100 


1174 


AL022238




tiuiU4^ivj.u , 3 \novel protein) 


1285 


100 


1175 


U41278 


Is elegans 


F33G12.3 gene product


332 


28 


1176 


M3S617 


Homo sapiens 


T-cell receptor V-alpha-J-alpha region


284 


83 


1177 


AC012680 


Arabidopsis 
thaliana


putative protein phosphatase 
2C; 554S5-56414 


209 


37 


1178 


G01345 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 5426. 


692 


99 


1179 


AL096767 


Homo sapiens 


dJ579N16.3 (novel protein
similar to worm, Arabidopsis 
and pine proteins) 




100 


1180 


AF039716 


Caenorhabdit 
is elegans 


similar to ATP synthase B 
chain 


496 


55 


1181 


Y11710 


Homo sapiens 


collagen type XIV 


1048 


97 


1182 


X&2240 


Homo 
sapiens] 
>R94 974 
R94974 09- 
MAY-1996 7.1- 
OCT- 1994 
Human TCL-i 
polypeptide. 


T cell leukemia/lymphoma 1


617 


100
SEQ
ID
NO:


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH-
WATERMAN 
SCORE 


IDENTITY 






[Homo 
sapiens 








1183 


U42841 


Caenorhabdit
is elegans 


Short region of weak 
similarity to collagen 


161 


33 


1185 


AJ131613 


Homo sapiens


dicarboxylate carrier protein 


1470 


99


1186 


L2764S 


Danio rerio


growth-associated protein


130 


36 


1187 


Y02738 


Homo sapiens 


Human secreted protein 
encoded by gene 89 clone 
HLHFP03.


636 


100


1188


AF217544 


Xenopus 
laevis 


ornithine decarboxylase-2


1459 


60 


1189 


AL136307


Homo sapiens


dJ360B8.2 (Neuritin, a 
protein which promotes 
neurite outgrowth) 


182


33


1190 


X89602 


Homo sapiens 


rTSbeta 


197 


100 


1191


U32828 


Haemophilus 

influenzae 

Rd 


ribosomal protein S6 
modification protein (rimK) 


268 


31 


1192 


AF154831 


Rattus 
norvegicus 


PV-l 


1403 


60 


1193 


y50926 


Homo sapiens 


Human fetal brain cDNA clone 
vcl6_l derived protein. 


918 


100 


1194 


AF02e530 


Rattus 
norvegicus 


stathmin-like-protein splice 
variant RB3


1093 


97 


1195 


U35244 


Rattus
norvegicus 


vacuolar protein sorting 
homolog r-vps33a 


2981 


9^ 


1196 


y70470 


Homo sapiens 


Human p53 target molecule,
PRG3 protein. 


1680 


100 


1197 


AF1S7318 


Homo sapiens 


AD-017 protein


912 


47 


1198 


AF125443 


Caenorhabdit
is elegans 


contains similarity to S.
pombe phosphatidyl synthase
(GB:Z28295) 


460 


39 


1199 


AF201934 


Homo sapiens 


DC12 


1649 


86 


1200 


AI*03177S 


Homo sapiens 


dJ30M3.3 (novel protein 
similar to C. elegans 
Y63D3A.4) 


1902 


100 


1201 


M21103 


Ovis aries 


BIIIB4 high-sulfur keratin


484 


82 


1202 


285986 


Homo sapiens 


ciJ108KH,3 (similar to yeast 
suppressor protein SRP40) 


1143 


75 


1203 


U18762 


Rattus 
norvegicus 


retinol dehydrogenase type I


890 


52 


1204 


U35730 


Mus musculus


Jerky 


2235 


76 


1205 


7VB002327 


Homo sapiens 


KIAA0329 


151 


24 


1206 


AB019233 


Arabidopsis 
thaliana 


ubiquinone/menaquinone
biosynthesis
methyltransferase-like


762 


56 


1207 




Homo sapiens 


<i«73 80B8.2 (Neuritin, a 
protein which promotes 
neurite outgrowth.) 


742 


100 


1208


AF207989 


Homo sapiens 


orphan G-protein coupled 
receptor 


2326 


100 


1209 


297630 


Homo sapiens 


dJ466N1.4 (novel protein
similar to ANK3 (ankyrin 3,
node of Ranvier (ankyrin 
G))) 


181 


44 


1210 


U21S4 9 


Mus musculus


Ac39/physophilin


1280 


68 


1211 


Y27700 


Homo sapiens 


Human secreted protein 
encoded by gene No. 12.


1267 


100 


1212 


AF117814 


Mus musculus 


odd-skipped related 1 protein


945 




1213 


AF277233 


Naegleria 
fowleri 


calcineurin B 


222 


39 


1214 


D14849 


Mus musculus 


meiosis-specific nuclear
structural protein 1 


1950 


77 


1215 


GO3022 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 7103.


590 


100 


1216 


Z72S10 


Caenorhabdit


Similarity to yeast UTR3


634 


49
TABLE 2



SEQ
ID
NO:


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH-
WATERMAN
SCORE


IDENTITY 






is elegans 


protein (Swiss Prot accession 
yk677hll,s comes from this 
gene 






1217 


Z49703 


Saccharomyce 
s cerevisiae 


unknown 


134 


' AA 


1218 
1219 


ACO13430 
L10910 


Arabidopsis 
thaliana 
Homo sapiens 


F3F9.18

splicing factor 


199 


29 


1220 


Z707S0 


Caenorhabdit 
is elegans 


similar to vanadate 
resistance protein 
transmembranous comes from 
this gene 


1026 
965 


71 

58


1221 


AL163815 


Arabidopsis 
thaliana 


putative protein 


653 


61 


1222 


APlSSlOO 


Homo sapiens 


zinc finger protein NY-REN-21
antigen 


2261 


100


1223 


J0S071 


Bos taurus 


GTP-binding regulatory
protein gamma-6 subunit


356 


100 


1224 

1225
1226


Y73364 

AL050170
X64002 


Homo sapiens 

Homo sapiens 
Homo sapiens 


HTRM clone 276S991 protein 
sequence.

hypothetical protein 
RAP74 


1169

714 
2661 


99 

100

99 


1227 
1228 
1229 

1230 
1231 


X04085 

A0r005620 

AF045564 

JC97571 
1.08239 


Homo sapiens 
Mus musculus
Rattus 
norvegicus 
Mus musculus 
Homo sapiens 


catalase
skeletal muscle-specific gene 
development -related protein 

ns-^iv xiiL.t:x4L.(,xng procexn 
located at OATLl 


2846 
1416 
171S 

479 
2274 


100 

90 

93 

96 
100 


1232
1233
1234 


AF121863 
AF121863 
AC024805 


Homo sapiens 
Homo sapiens 
Caenorhabdit 
is elegans 


sorting nexin 14 
"^***»c»4»**o oiLuiixcincy CO 
TR:O0459S 


1964 
1203 
744 


100 

84 

31 


1235


AC006^34 


Caenorhabdit 
is elegans 


contains similarity to 
Saccharomyces cerevisiae
probable membrane protein
YLR418C (GB:U20162) 


357 


33 


1236 
1237 


yiBioi 

AB042646 


Mus musculus 
Homo sapiens 


macrophage actin-associated-

tyrosine-phosphorylated 
protein 

•iMiP2 — 


1 CCQ 

1224 


87 
100


1238 
1239 


AB026264 
AB026264


Homo sapiens
Homo sapiens 


IMPACT 
IMPACT 


1694 
1123 


100 
100 


1240 


G00429 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 4510. 


324


100 


1241 


Y76144 


Homo sapiens 


Human secreted protein 
encoded by gene 21. 


1363 


53 


1242 


AI.035602 


Arabidopsis
thaliana 


putative protein 


499


28 


1243 


X76483 


Gallus
gallus


Yes-associated protein
(65kDa)


574 


48


1244 
1245 


AF220186 
AL0214S3 


Homo sapiens 
Homo sapiens 


uncharacterized hypothalamus 
protein HT012 

dJ821D11.3 (PUTATIVE protein)


503 
8S^ 


100 
100


1246 
1247

1248


YS7910 


Homo sapiens
Homo sapiens 


GAR1 protein

Human transmembrane protein
HTMPN-34. 


1216 
1369 


100 
98


1249


AC004874


Homo sapiens


similar to N- 

acetylgalactosaminyltransferase;
similar to Q07537
(PID:g1171989)


957 


100 


AF199597 


Homo 
sapiens 


A- type potassium channel 
modulatory protein 1 


1139 


100 


1250 


Y13148 


Rattus 
norvegicus 


PAG60B 


1350 


88 


1251


M24852 


Rattus 
norvegicus 


neuron-specific protein PEP-19


124 


46
TABLE 2



SEQ 
ID 
NO: 


NUMBER 


SPECIES 


DESCRIPTION 


SMITH-
WATERMAN 
SCORE 


IDENTITY 


1252 


AF14673e 


Rattus 
norvegicus 


testis specific protein 


771


83 


1253


G02725 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6806. 


419 


g -J ■ ' "1 


1254


W44375 


Homo sapiens 


Human ubiquitin-conjugating
enzyme polypeptide. 


1045 


99


1255


AC0a6538 


Homo sapiens 


BC41195_1 


831 


78


1256


AB004316 


Bos taurus


mitochondrial methionyl-tRNA
transformylase 


15S6 


88


1257 


Z3S094 


Homo sapiens 


SURF-2 


1354 


97 


1258 


Y13362 


Homo sapiens 


Amino acid sequence of
protein PRO214.


2383 


100 


1259 


AC006014 


Homo sapiens 


similar to RFP transforming
protein; similar to P14373
(PID:g132517)


1299 


100 


1260 


ACOOS099 


Homo sapiens


match to AI222572
(NID:g3804775)


469 


100


1261 


VO05O7 


Homo sapiens 


1st base in codon) (SSi is 
3rd base in codon) 


984 


100 


1262 


X15443 


Rattus sp. 


(AA 1-568) 


697 


32 


1263 


AP173871 


Mus musculus


neuronal PAS 3


9 v7 


94


1264 


AF178983 


Homo sapiens


Ras-associated protein Rap1


433 


97 


1265 


Y70473 


Homo sapiens 


Human cyclic nucleotide- 
associated protein-1 (CKAP-1).


2785 


99 


1266 


Y41738 


Homo 
sapiens 


Human PR0541 protein 
sequence.


1622


100


1267 


AF061346 


Mus musculus


Edpl protein 


1077 


64 


1268 


U97006 


Caenorhabdit 
is elegans 


C13F10.4 gene product


154 


23 


1269 


AF233582 


Mus musculus 


GTPase Rab37


941


95 


1270 


AP195951 


Homo sapiens 


signal recognition particle
68 


3i27 


98 


1271 


AL031177


Homo sapiens 


dJ899Mi5.3 (novel protein) 


1150 


55 


1272 


AF201933 


Homo sapiens 


DCll 


650 


100 


1273 


AF201933 


Homo sapiens 


DCll 


346 


98 


1274 


AI.02171O 


Arabidopsis 
thaliana


putative protein 


348 


49 


1275 


ACQ 04449 


Homo sapiens 


R336e3_3 


556 


100 


1276 


ya^29s 


Homo sapiens 


Human secreted protein 
HL2AGB7, SEQ ID NO: 210. 


1920 


100 


1277 


Y71111 


Homo sapiens 


Human Hydrolase protein-9
(HYDRL-9).


1576 


99 


1278 


S94421 


Homo sapiens 


T cell receptor eta-exon 


476 


100


1279 


Y66695 


Homo 
sapiens 


Membrane-bound protein
PRO1344.


1909 


100 


1280 


APieiBSO 


Homo sapiens 


lkSPC262 


772 


100 


1281 


Y48610 


Homo sapiens 


Human breast tumour- 
associated protein 71. 


779 


100 


1282 


AC015446 


Arabidopsis 
thaliana


Similar to AlGi protein 


406 


35 


1283 


AKa24432 


Homo sapiens


FLJ00022 protein 


403


35 


1284 


W96153 


Homo sapiens 


Human FADD-interacting
protein (FIP).


1825 


81 


1285 


Aa00lO19 


Homo sapiens 


ring finger protein 


1301 


100 


1286 


AE0C3823 


Drosophila 
melanogaster 


CG13178 gene product 


195 


29 


1287 


AP178632 


Homo sapiens 


FEM-1-like death receptor
binding protein 


32'6X 


100 


1288 


AG006033 


Homo 
sapiens 


similar to MLH 64; similar to 

138027 (PID:g2135214) 


1195 


100 


1289 


ACO06O33 


Homo 
sapiens 


similar to MLH 64; similar to
138027 (PID:g2135214)


668 


93 


1290 


AB023eil 


Homo sapiens 


Ta3A


:ii>l 1 54 1
TABLE 2



SEQ
ID
NO:


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


IDENTITY


1291 


Z73424 


Caenorhabdit

is elegans 




235 


36 


1292 


Y94871 


Homo 
sapiens


Human protein clone HP02SS1. 


1222 


100 


1293 


AF1B0425 


Homo sapiens


protein RAP140 


489 


US 


1294 


G03856 




Human secreted protein, SEQ


538 


99 


1295


AF133670 


Mus musculus 


ARL-6 interacting protein-2 


367 


51


1296 


AJ24973S 




claudin-6


1142 


100


1297 


X57560 


Escherichia 
coli


pspE protein 


535 


100 


1298 


AF169284 


Homo sapiens 


LIM and cysteine-rich domains 
protein 1 


1997 


100 


1299 


U41023 


Caenorhabdit
is elegans 


coded for by C. elegans cDNA
yk61fl.3; coded for by C. 
ykl09h8 - 5 


324 


■~P.9 


1300 


AB024S23 


Homo sapiens 


basic kruppel like factor


1206 


100 


1301 


XS59B9 


Homo sapiens 


eosinophil cationic-related 
protein 


737 


99 


1302 




Homo sapiens 


unknown 


1481 


100


1303 


X52904 


Escherichia
coli 


open reading frame (AA 


359 


100


1304 


U19577 


Escherichia 
coli 


galactonate dehydratase


242 


93 


1305 


AF266508 


Mus musculus 


NEliF protein 


1409 


97 


1306


Y57901


Homo sapiens 


Human transmembrane protein 
HTMPN-2S . 


932 


100 


1307 


U58750 


Caenorhabdit 
is elegans


similar to the mitochondrial 
carrier family 


365 


54 


1308 


AF044774


Homo sapiens 


breakpoint cluster region 
protein 2 


2681 


99 


1309 


AI.078593 


Homo sapiens 


dJ210Bl.l (KIAAOSBO) 


267


34 


1310 




Homo sapiens


E48 antigen 


€20 


96 


1311 


Z82263 


Caenorhabdit
is elegans


C47A4.1 


283 


35


1312 


AFa312ia 


Homo sapiens 


Chromosome 16 open reading 
frame S 


1493


100 


1313 


Y41763 


sapiens 


Human PRO936 protein
sequence. 


1636 


100 


1314 


AF196972 


Homo sapiens 


JM24 protein 


2239 


100 


1315 


AP0S3356 


Homo sapiens 


insulin receptor substrate
like protein 


228 


97 


1316 


Y66695


Homo 
sapiens 


Membrane-bound protein 
PRO1344.


1909 


100 


1317 


AF153127


Gallus 
gallus 


SAPK interacting protein 


2442 


89 


1318 


AF153127 


Gallus 


SAPK interacting protein 


1477 


83 


1319 


AF153127


Gallus 


SAPK interacting protein 


1651 


86 


1320 


XS6932 


Homo sapiens 


23 kD highly basic protein 


1044 


100 


1321 


AI<'174605 


Homo 
sapiens J 
>Y830a6 
y830a6 09- 
MAR-2000 28- 
AUG-1998 F- 
box protein 
FBP-18. 
THomo 
sapiens 


F-box protein Fbx25




70 


1322 


MS1732 


Trypanosoma 
cruzi 


neuraminidase 


214


24 


1323 


yi7013 


porcine
endogenous 


pol 


304 


64
TABLE 2
SEQ
ID
NO:


ACCESSION
NUMBER






SMITH-
WATERMAN 
SCORE 


IDENTITY 






retrovirus 








1324 


AL138655 


thaliana 


putative protein


1174 


37 


1325 


AL138655 


thaliana 


putative protein


946 


3b 


1326 


AL133215 


Homo sapiens 


bAl08I*7,2 (novel protein 
similar to rat tricarboxylate 


1322 


99 


1327 


AF161541 


Homo sapiens 


HSPC056


1357 


99


1328 


Y73346 


Homo sapiens 


HTRM clone 619699 protein 
sequence. 


785 


96 


1329 


L10910 


Homo sapiens


splicing factor


912 


82 


1330 


AF146568 


Homo sapiens 


MILI protein 


1936 


100 


1331 
1332 


W87772 


Homo sapiens 


Human serum glucocorticoid- 
regulated kinase (H-SGK2) 
polypeptide. 


232 


39 


Y41741 


Homo 
sapiens 


Human PRO? 04 protein 
sequence.


1860 


100 


1333 


AF29S096 


Homo sapiens 


zinc-finger protein ZBRK1


411 


91 


1334 


282271 


Caenorhabdit
is elegans 


Similarity to Mouse kinesin-like
protein KIF4 comes from
this gene 


578 


44 


1335 


AE000810 


Methanobacterium
thermoautotrophicum


conserved protein 


290 


43 


1336 


Y68779 


Homo sapiens 


Amino acid sequence of a 
human phosphorylation 
effector PHSP-ll. 


1019 


91 


1337 


AB027003 


Mus musculus


protein phosphatase 


378 


84 


1338


U64856


Caenorhabdit
is elegans 


weak similarity to TPR 
domains 


215 


40 


1339 


AE001394 


Plasmodium
falciparum


protein of the YMR7 family 


170 


29 


1340 


X767a7 


Homo sapiens


MT-ll protein 


204 


89


1341 




Arabidopsis


putative mutX protein; 6 83 98- 
67881 


289 


45 


1342 


AJ276171 


Homo sapiens 


ASPIC 


2122 


100 




AF187016 


Homo sapiens 


myosin regulatory light chain 
interacting protein MIR


2303 


99 


1344 


ACa06963 


Homo sapiens 


similar to Kelch proteins; 
sxmxxar to oiAA/ / 
(PID:g4650844)


894 


35


1345 


AF2S74^^ 


Homo sapiens 




1880 


99 


1346 


725896 


Homo sapiens 


Human rir-r^t- f» 'i n 

fragment encoded from gene 
64 . 


1146 


100 


1347 


AJ272073 


Torpedo 
marmorata


male sterility protein 2-like
protein 


1664 


58 


1348 


AF161548 


Homo sapiens 


HSPC063 


lOlS 


96 


1349 


W78128 


Homo sapiens 


Human secreted protein 
encoded by gene 3 clone 
HOSBr96 - 


1117 


100 


1351 


G02144 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6225. 


418 


100 


1352 


D90869 


Escherichia 

coli


similar to 


2047 


100 


1353 


A12029 


Homo sapiens 


MRp.14 


613 


100 


1354


AC005328


Homo sapiens 


R26660_l, partial CDS 


670 


74 


1355 


AC024876 


Caenorhabdit 
is elegans 


contains similarity to 
SW:RPB1_CRIGR 


829 


61 


1356 


AF077226 


Homo sapiens


Copine III 


1876 


64 


1359 


AF217188 


Mus musculus


YIPIB 


801 


€3 


1360 


AC074331 


Homo sapiens 


2NF234 


3869 


100


1361 


AL163279 


Homo sapiens


Homolog to cAMP response 


5035 


99
TABLE 2



SEQ 
ID 
NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


IDENTITY 








element binding and beta 
transducin family proteins 






1362


Z48475 


Homo sapiens 


glucokinase regulator


3160 


99 


1363 


Z48475


Homo sapiens 


glucokinase regulator 


2682 


97 


1364 


-AF19^764 


Homo sapiens 


megakaryocyte-enhanced gene 
transcript 1 protein; MEGTl 
protein 


205S 


99


1365


AF116609 


Homo sapiens 


PRO0915 


581 


100 


1366 


AF116609 


Homo sapiens 


PRO0915


581


100


1367 


AL1173S2 


Homo sapiens 


dJ876B10.3 (novel protein
similar to C. elegans
T19B10.6 (Tr:Q225S7)) 


2581 


99


1368 


Y34124 


Homo 
sapiens 


Human potassium channel 
K+HnovlS . 


1342 


100 


1369 


AJ245621


Homo sapiens 


CTL2 protein 


3728 


99 


1370 


AF008220 


Bacillus 
subtilis 


YtaG 


429 


45 


1371 


X05S62 


Homo sapiens 


alpha-2 chain precursor (AA -25
to 1018) (3416 is 2nd base
in codon)


5908 


99 


1372


Z98048 


Homo sapiens 


dJ408N23.4 (novel DnaJ domain 
protein)


1296 


99 


1373


AF154415 


Homo sapiens 


FLASH 


10253 


100 


1374


U20286 


Rattus 
norvegicus 


lamina associated polypeptide 
1C


1567 


69 


1375


US3445 


Homo sapiens 


DQCl 


1645 


46 


1376 


AL117337 


Homo 
sapiens 


bA393J16.1 (zinc finger 
protein 33a (KOX 31)) 


2S0 




1377


AC005328 


Homo sapiens 


R26660_l, partial CDS


1126 


100 


1378 


U3S113 


Homo sapiens 


metastasis-associated gene


1823 


69 


1379 


1.15313 


Caenorhabdit
is elegans


putative 


858 


58 


1380 


Y257S6 


Homo sapiens 


Human secreted protein 
encoded from gene 46. 


1508 


100 


1381


AB037360 


Homo sapiens 


ANKHZN 


5734 


95


1382


AB037360 


Homo sapiens 


ANKHZN 


959 


97 


1383 


AF237676 


Mus musculus


G beta-like protein GBL


1721 


96 


1384


AF237676 


Mus musculus 


G beta-like protein GBL


1043 


70 


1385


Y58793 


Homo sapiens 


Human calcium regulatory 
protein CaRt«5-i . 


715 


100 


1386


AF212162 


Homo sapiens 


nxnean 


10369 


99 


1387


AL0316B5 


Homo sapiens 


dJ963K23.2 (novel protein) 


337 


33 


1388


AC004890


Homo sapiens 


similar to zinc finger 
proteins; similar to BAA24380 
>W06316 W06316 03-OCT-1996 
27-APR-199S TRP-1 protein. 


542 


86 


1389 


AF187989 


Homo sapiens 


zinc finger protein ZNF223 


2S€S 


99 


1390


AC0351S0 


Homo sapiens 


Zinc finger protein ZNF221 


3459 


100 


1391 


AF287894 


Homo sapiens 


PIST 


1410 


97 


1392


AF282265 


Homo sapiens 


inner centromere protein 
INCENP


1794 


99 


1393


X90840 


Homo sapiens 


axonal transporter of 
synaptic vesicles 


4584 


99 


1394 


AF076249 


Homo sapiens 


zinc finger protein SBBI21 


3208 


99 


1395 


G02224 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6305.


299 


75 


1396 


AC004809 


Arabidopsis 
thaliana 


Similar to 


130 


34 


1398 


AF242519 


Homo sapiens 


zinc finger protein SBZF3 


181 


65 


1399 


AL133396 


Homo 
sapiens 


dJl068H6.4 (prion protein 
like protein doppel)


962 


100


1400 


Y48611 


Homo sapiens 


Human breast tumour- 
associated protein 72.


817 


99 


1401 


AC004472


Homo sapiens 


PI. 11659 5 


280 


54 


1402


X91489 


Saccharomyce 
s cerevisiae


putative HMG box 


164 


27






WO 01/53312



TABLE 2 



PCT/US00/34263



SEQ 
ID 
NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


% 

IDENTITY 


1403

1404 
1405 


Y79222 

X8iose 

AB012084 


sapiens 

Mus musculus

Mus musculus 


Human transferase TRNSPS-14.
ITM 


2842 
1010 


100 
99 


1406 
1407 


AB030251 
AJ010S8S 


Homo sapiens
Rattus
rattus 


GTPase activating protein 
PTB-like protein


194 

3233 

2684 


29 

99
99 


1408


X75760 


Drosophila


LRR47 


364 


29


1409 


U76618 


Mus musculus 


N-RAP 


804 


48 


1410 


U U 3 D / O 


Homo sapiens 


P20e87_l, partial CDS 


835 


63 


1411 


AB000284 


Escherichia 
coli 


orf, hypothetical protein


360 


100 


1412 


X01S63 


Escherichia 
coli 


L5 (rplE) (aa 1-179) 


911 


100 


1413 


W78279 


Homo sapiens 


Fragment of human secreted 
protein encoded by gene 33. 


1264 


99 


1414 


AB031051 


Homo sapiens 


organic anion transporter 
OATP-E 


3832 


100 


1415 


M17466 


Homo sapiens 


coagulation factor XII 


3455 


100


1416 


AF097994 


Homo 
sapiens 


L-kynurenine/alpha-
aminoadipate aminotransferase 


2202 


99 


1417 


AF151077 


Homo sapiens 


HSPC243 


1262 


99 


1418 


Y0994S 


Rattus 
norvegicus 


putative integral membrane 
transport protein 


1098 


61 


1419 


U13152 


Mesocricetus 
auratus 


guanine nucleotide-binding
protein beta 5 


2179 


76 


1420 


AI4162458 


Homo sapiens 


bA46SL10.5 (KIAA1176 < novel 
protein, presumed ortholog 
of mouse K-Cl cotransporter 
KCC2) ) 


5696 


100


1421 




y99426 


Homo sapiens 


Human PRO1604 (UNQ785) amino
acid sequence SEQ ID NO: 308.


1S2 


29 


1422 


Y94923 


Homo sapiens 


Human secreted protein clone 
qsi4_3 protein sequence SEQ 
ID NO: 52.


4039 


99 


1423 


AF177388 


Homo
sapiens


cancer-amplified
transcriptional coactivator
ASC-2


10748 


99 


1424 


y4 8S17 


Homo sapiens 


Human breast tumour- 
associated protein 62.


1851 


99 


1425 


AF208848 


Homo sapiens


BM-006


1454 


89 


1426 


AF208848


Homo sapiens 


BM-006 


853 


79 


1427 


AF1128S6 




differentiation enhancing
factor 1 


4693 


95 


1428 


041387 


Homo sapiens




1372 


63 


1429 


AF161534 


Homo sapiens 




2853 


78 


1430 


AFX25043 


Mus musculus 


bisphosphate 3 ' -nucleotidase 


275 


30 


1431 


Y66718 


sapiens 


Membrane-bound protein
PRO1106 


1866 


100 


1432 


AF193613 


Homo sapiens 


Caspr2 


568 


100 


1433 


Afi044560 


Mus musculus 


Gliacolin


192 


34 


1434 


R99300 


Homo sapiens 


NTII-1 nerve protein, 
facilitates regeneration of 
nerve cells.


707 


51 


1435


AF220530


Homo sapiens 


myo-inositol 1-phosphate
synthase A1


2904 


100


1436 


X70944 


Homo sapiens 


PTB-associated splicing 
factor 


1261 


72 


1437 


AF271732


Homo sapiens 


bridging integrator-3


1282 


100 


1438


Y30811 


Homo sapiens 


Human secreted protein 
encoded from gene 1. 


595 


98 


1439 


AJ293659 


Homo sapiens 


mucolipidin


628 


97 


1440 


AF219138 


Homo sapiens 


GGA3 long isoform


3083 


100


1441 


AF219138 


Homo sapiens 


GGA3 long isoform


3346 


100
SEQ
ID
NO:
1442 


ACCESSION 
NUMBER 

AS039669 


SPECIES

Homo sapiens


DESCRIPTION


AiiKX3


SMITH-
WATERMAN 
SCORE 


IDENTITY 


1443 


AF237711 


melanogaster


Diablo


1944 
291 


100 
27 


1444 


AJ011096 


Homo sapiens 


Nafl beta protein 


439


39 


1445 


X73874 


Homo sapiens 


phosphorylase kinase 


6233 


98 


1446 


AF214114 




breast carcinoma-associated
antigen BCAA 


3999 


99 


1447 


7VFQ03924 






2645 


99 


1448 


AF003136 


Caenorhabdit
is elegans


contains weak similarity to 
an AMP-binding motif 


2843 


52


1449 


AF155112


Homo sapiens 


NY-REN-50 antigen


1184 


89 


1450 


y95004 


Homo sapiens 


Human secreted protein 
vc54_l, SEQ ID NO: 46.


S8S 


100 


1451 


AF107203 


Homo sapiens 


ataxin 2-binding protein


688 


57 


1452 


AF107203 


Homo sapiens 


ataxin 2-binding protein


456 


78 


1453 


Z38011


Mus musculus


DMR-N9 


882 


56 


1454 


X90568 


Homo sapiens 


Protein sequence and 
annotation available soon via 
LABEIT@EMBL-Heidelberg.DE


510 


28 


1455 


AL035409 


Homo sapiens 


dJ564Mll-3 (similar to 
sialyltransferase)


1356 


100 


1456
1458 


D44480 
AF141326 


Mus musculus 
Homo sapiens 


MATH-2 protein

RNA helicase HDB/DICE1


272 


100 


1459 


AF242552 


Gallus
gallus 


retinovin 


478 
945 


45 
34 


1460 


U11036 


Homo sapiens 


Ibdl 


724 


84 


1461 


AB02^2^8 ■ 


Mus musculus


granuphilin-a


545 


39 


1462 


Y08134 


Homo sapiens 


acid sphingomyelinase-like
phosphodiesterase


2428 


99 


1463 


AC004997 


Homo sapiens 


match to ESTs E43979 
(NID:gS73097J , R1369S 
(NID:g774333) 


869 


98 


1464 


AC004997 


Homo sapiens 


match to ESTs 243979 
(NIDigS73097) , R19699 
(NID:g774333) 


869 


98 


1465 


U32743 


Haemophilus 

influenzae 

Rd 


fucose operon protein (fucU)


315 


50 












1466 


Y09022 


Homo sapiens 


Not56-like protein


2342 


100 


1467 




Homo sapiens 




Homolog of rat kidney- 
specific (KS) gene 


1072 


99 


1468 


AF071 ^44 


Spinacia
oleracea


ribulose-1,5-bisphosphate
carboxylase/oxygenase small
subunit N-methyltransferase I


333 


2^ 


1469 


yS7930 


Homo sapiens 


Human transmembrane protein 

HTMPM- tiA 


1053 


100 


1470 


AF032666 


Rattus
norvegicus 


rsecS 


4504 


93 


1471 


Y70467 


Homo sapiens 


Human membrane channel
protein-17 (MECHP-17).


452 


74 


1472 


AI.031033 


Homo sapiens 


\*A£,xud, , A tRitJosomal I^arge 
i^uL^unic jr seuaour iQine 
Synthase protein) 


1694 


100 


1473 


AF177292 


Homo sapiens 


genethonin 3 


4026 


98 


1474 


S45936 


Homo sapiens 


HTSl 


1101 


50 


1475 


Y8^241 


Homo sapiens 


Human secreted protein 
HOABR60, SEQ ID NO: 156. 


1879 


98 


1476 


AJ010317 


Fugu 
rubripes 


Sand 


1278 


68 


1477


U42831 


Caenorhabdit 
is elegans 


coded for by C. elegans cDNA
yk99b4.3; similar to human 
transforming protein 
(PIR:S22157) 


846 


44 


1478 


X62447 


Homo sapiens 


PR 264 


543 


61 


1479 


X82209 


Homo sapiens 


MNl 


7116 


100 


1480


U1053G 


Pan paniscus


MHC class I A


675 


84






TABLE 2 






SEQ 
ID 
NO:


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


IDENTITY 


1481 


AL078S99 


Homo sapiens 


dJ9SlC6.1 (novel protein 
similar to C. elegans 
FS5A12,9 (Tr:P91086)) 


1274 


65 


1482 


298977 


Schizosaccha 

romyces 

pombe


putative vacuolar protein 


256 


29 


1483 


AB005562 


Mus musculus 


JNK/SAPK-associated protein-1


4968 


92 


1484 


AL050120


Homo sapiens 


hypothetical protein 


716 


100


1485 


M27678 


Homo sapiens 


DNA binding protein 


1006 


53 


1486 


Y69161


Homo sapiens 


Amino acid sequence of a
partial protein kinase.


575


99 


1487 


X84156 


Saccharomyce 
s cerevisiae


ATHl 


341 


29


1488 


AF038963 


Homo sapiens 


RNA helicase 


446 


34 


1489 


U56966 


Caenorhabdit
is elegans 


coded for by C. elegans cDNA 
yk30b3.5; coded for by C.
elegans cDNA yk30b3.3 


620 


42 


1490 


AE000989 


Archaeoglobu 
s fulgidus


enoyl-CoA hydratase (fad-4) 


533 


46 


1491 


M80633 


Rattus 
norvegicus


adenylyl cyclase type IV 


707 


95 


1492 


Y73342 


Homo sapiens 


HTRM clone 27090S5 protein 
sequence . 


'3&13 


99 


1493 


Y17220 


Homo sapiens 


Human secreted protein (clone 
f j283-ll) . 


462 


37


1494 


AF133670 


Mus musculus 


ARL-6 interacting protein-2 


701 


97 


1495 


Y94897 


Homo 
sapiens 


Human protein clone HP10574. 


1371 


100 


1496 


ALa49699 


Homo sapiens


dJ747H23.2 (novel protein) 


1550 


100 


1497 


AF037447 


Homo sapiens 


ribosomal S6 protein kinase


2427 


100 


1498 


AL445067


Thermoplasma 
acidophilum 


putative target YPL207w of
the HAP 2 transcriptional 
complex related protein 


269 


35 


1499


AB039947 


Homo sapiens 


XI 1I»- binding protein 51 


227 


36


1500


AJ277750


Homo sapiens 


UBASH3A protein 


3509 


100 


1501 


AL050333


Homo 
sapiens 


dJ93K22.1 (novel protein
(contains DKFZP564B116))


2439 


100 


1502 


AF179895 


Homo sapiens 


TALE homeobox protein Meis2b


1140 


100 


1503 


AF178948 


Homo sapiens 


TALE homeobox protein Meis2a


1177 


100 


1504 


Y53005 



Homo sapiens 


Human secreted protein clone
pn749_8 protein sequence SEQ
ID NO: 16.


1442 


99 


1505 


XB2494 


Homo sapiens 


fibulin-2


35BQ 


99 


1506 


X98296 


Homo sapiens 


ubiquitin hydrolase 


783 


42 


1507




Homo sapiens 


dJ1103G7.6 (novel protein)


1098 


100 


1508 


Y76144 


Homo sapiens 


Human secreted protein 
encoded by gene 21. 


1736 


100 


1509 


AF2201S2 


Homo sapiens 


uncharacterized hypothalamus 
protein HT008 


1181 


98 


1510 


U64^01 


Caenorhabdit 
is elegans 


Gene probably begins in the 
next cosmid


415 


58 


1511 


AliS 56192 


Neurospora 
crassa 


related to mdm1 protein




29 


1512 


D17629 


Homo 
sapiens 


N-acetylgalactosamine 6- 
sulfate sulfatase (GALNS) 


1829 


100 


1513 


AF16a717 


Homo sapiens 


X 009 protein 


694 


99 


1514 


AJ243531 


Homo sapiens 


nMlS protein 


735 


100 


1515 


AC003672 


Arabidopsis 
thaliana


putative C3HC4-type RING zinc 
finger protein 


407 


30 


1516 


AF115435 


Rattus 
norvegicus


syntaxin 17 


1374 


90 


1517 


AF003140 


Caenorhabdit
is elegans 


C44E4.5 gene product 


274 


31 


1518 


AB0025S4 


Rattus 
norvegicus


Beta-alanine-pyruvate
aminotransferase 


2238 


82 


1519 


AX>121764 


Schizosaccha


yeast atpl2 protein precursor 


270 


30 









romyces 


homolog






1520 




Homo 
sapiens 


vascular endothelial
junction-associated molecule 


547 


100


1521


D3 1764 


Homo sapiens


MAft.uUD4 


170 


27 


1522 


X D 0 D ^ 4 


Homo 
sapiens


Membrane-bound protein


985 


100 


1523 


Y94450 


Homo sapiens 


Human inflammation associated 
protein 


250 


43 


1524 


AVj U 0 O 1 O 7 


Arabidopsis
thaliana 


F17F8 , 22 


277 


37 


1525 


AF109377 


Mus musculus 


IdlBp 


1277 


83 


1526 


AL031427 


Homo sapiens 


dJ167A19.4 (novel protein) 


1432 


99 


1527


Y0813S 


Mus musculus 


acid sphingomyelinase-like
phosphodiesterase 


1496 


79 


1528 


AK024423


Homo sapiens


FLJ00012 protein


611 


100 


1529 


AF154502 


Homo sapiens 


quiescent cell proline 
dipeptidase


679 


100 


1530 


AF20559e 


Homo sapiens 


transposase-like protein 


1368 


100 


1531 


AF2S1033 


Homo sapiens 


putative zinc finger protein 


1420 


50 


1532 


W748Q5 


Homo sapiens 


Human secreted protein 
encoded by gene 77 clone 
HOEAS24 . 


493 


57 


1533 


AF039023 


Homo sapiens 


Ran-GTP binding protein; 
RanBPe 


S707 


99 


1534 


AC007190 


Arabidopsis
thaliana 


r23N19.9 


:i74 


37 


1535 


AB027564 


Homo sapiens 


DINB1


4482


100


1536


Y36178


Homo sapiens 


Human secreted protein 


377 


87 


1537


Y50907


Homo sapiens


Human fetal brain cDNA clone
vb3_l derived protein. 


3693 


39 


1538


AF017368


Mus musculus


faciogenital dysplasia
protein 2 


177 


47 


1539 


AF2€6756 


Homo sapiens 


sphingosine kinase


2011 


99 


1540 


Z46804 


Homo sapiens 


OA1


2238 




1541 


AF000195 


Caenorhabdit
is elegans


Contains similarity to Pfam
domain: PF00169 (PH),
Score=20.6, E-value=1.9e-05,
N=1


379 


45 


1542 


Y711^9 


Homo sapiens 


Human phosphodiesterase 
interacting protein, 
myomegalin. 


9415 


99 


1543 


X76092 


Homo sapiens 


DNA binding protein RFX3 


3327 


100 


1544 


AB015330 


Homo sapiens 


HRIHFB20O7 


631 


50 




AT d.2tO*m C» / 


Homo sapiens


transcription factor LBP-lb 


2622 


100 


1546 




Caenorhabdit
is elegans 


Similar to BZIP transcription 
factor 


518 


42 


1547 


X55885 


Homo sapiens


KDEL receptor


1106 


100 


1548


AB035495 


Carassius 


ubiquitin-activating enzyme
E1


836 


42 


1549


AL0217O7 


Homo sapiens 


dJ508H5.4 (KIAA0668) 


3688 


100 


1550


./lu if to 


Bacillus
subtilis 


YvtjK protein 


292 


42 


1551 


AF145615 


Drosophila
melanogaster


BcDNA.GH03377


822 


44 


1552 


JU,157734 


Schizosaccha 

romyces 

pombe 


putative mannosyl transferase 
involved in N-glycosylation 


435 


37 


1553 


AF079S27 


Mus musculus 


XER5 


691 


63 


1554 


Aa026291 


Rattus
norvegicus 


acetoacetyl-CoA synthetase 


1099 


88 


1555 


Y44722


Homo sapiens 


Human immune system molecule, 

ISMO-3. 


1780 


99 


1556 


AF116553 


Drosophila 
melanogaster


antennal-specific short-chain
dehydrogenase/reductase


277 


32 


1557 


Y71056


Homo sapiens 


Human membrane transport 


1375 


99











protein, MTRP-1.






1558 


Y71056 


Homo sapiens 


Human membrane transport 
protein, MTRP-1.


1975 


99 


1559 


Y71056 


Homo sapiens 


Human membrane transport 
protein, MTRP-1.


1894 


97 


1560 


AF092050 


Mus musculus 


beta-1,3-N-

acetylglucosaminyltransferase


262 


44 


1561 


AL109827


Homo sapiens 


da309K20.2 (acrosomal protein 
ACRSS (similar to rat sperm 
antigen 4 (SPAG4))) 


1607 




1562 


AJ131890 


Homo sapiens 


DNA polymerase lambda 


3002 


100 


1563 


AL035424


Homo sapiens 


dA22Dl2,l (novel protein 
similar to Drosophila Kelch 
proteins) 


301S 


100 


1564 


AC002400 


Homo sapiens 


Gene product with similarity 
to ubiquitin binding enzyme


2790 


100 


1565 


AC005306


Homo sapiens


R27216_1


919 


82 


1566 


AF000195 


Caenorhabdit 
is elegans 


Contains similarity to Pfam 
domain: PF00169 (PH),
Score=20.6, E-value=1.9e-05,
N=1


550 


45 


1567


AB033281 


Homo 
sapiens 


F-box and WD-repeats protein
beta-TRCP2 isoform C


2879 


100 


1568


D49473


Mus musculus


truncated form of Sox17


1047 


78 


1569


AK02527d 


Homo sapiens 


unnamed protein product 


210 


91 


1570 


X75756 


Homo sapiens 


protein kinase C mu


4797 


99 


1571 


AF145713 


Homo sapiens 


SCHIP-1


2^d8 


100 


1572


AE003831


Drosophila
melanogaster


CG18445 gene product 


180 


31 


1573


AF074603


Streptomyces
griseus 
subsp.
griseus 


NonF 


205 


38 


1574


U28993


Caenorhabdit
is elegans 


F22D3.3 gene product 


144 


27 


1575 


AF129507


Homo sapiens 


transcription factor ICBP90 


287 


68 


1576 


X64878 


Homo sapiens 


oxytocin receptor 


2002 


100 


1577 


AF237711 


Drosophila 
melanogaster


Diablo 


421 


54 


1578 


G00975 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5056.


480 


100 


1579 


AF248744


Cryptosporid 
ium parvum 


thrombospondin-related
adhesive protein 


123 


33 


1580




Homo sapiens 


ol»J58SI14 . 2 (novel protein 
(translation of cDNA 


663 


100 


1581 


AF041853


Homo sapiens


kinesin family member protein
KIF3A


345 


33 


1582 


AF025441 




wpa" xnuer^cKJu j.iig pjroce xn \JXir^ 


H9S 


100 


1583 


AE001803 


maritiwa 


gxyweirtue Kxnase/ pucacxve 




34 


1584 


AF252283 


Homo sapiens 


Kelch-like 1 protein 


3973 


100 


1585 


AF169675 


Homo 
sapiens 


leucine-rich repeat
transmembrane protein FLRT1


3494 


99 


1586


AF118274 


Homo sapiens 


DNb-5 


2626 


97 


1587 


X79440 


Homo sapiens 


NADP+-dependent malic enzyme 


3167 


99 


1588 


X99802 


Homo sapiens 


ZYG homologue


3966 


99 


1589 


AF169803 


Homo sapiens 


flavohemoprotein b5+b5R


2563 


100 


1590 


Y29861 


Homo sapiens 


Human secreted protein clone 
cb98 4. 


181 


47 


1591 


Z25535


Homo sapiens


nuclear pore complex protein
hnup153


7567 


99 


1592 


X13293 


Homo sapiens 


B-myb protein (AA 1-700) 


3678 


99 


1593 


M74027 


Homo sapiens 


mucin 


242 


27 


1594 



AL139314 


Schizosaccha
romyces


hypothetical protein 


235 


54









pombe 








1595


W7e324 


Homo sapiens 


Fragment of human secreted 
protein encoded by gene 81. 




318 


98


1596 


Y94906 


Homo sapiens 


Human secreted protein clone
rb649_3 protein sequence SEQ
ID NO: 18.


2236 


98 


1597 


AF174605 


Homo sapiens 


F-box protein Fbx25


1408 


99 


1598 


AB0322S4 


Homo 
sapiens 


bromodomain adjacent to zinc
finger domain 2A 


9676 


98 


1599 


X73114 


Homo sapiens 


slow MyBP-C 


S!56 8 


95 


1600 


X82200 


Homo sapiens


- CTioStaiFso ~~ — 


2305 


100 


1601 


Y00876 


sapiens 


sequence * 


1149 


98 


1602 


AJ223351 


Homo sapiens 


"Xiw*" Ain_t=i at. u iiiy ]proc6xn j 


2821 


99 


1603


AJ222801


Homo sapiens


neutral sphingomyelinase


2268 


99 


1604 


AJ222801 


Homo sapiens 


neutral sphingomyelinase


1601 


99 


1605


AFi85S76 


Mus musculus


POZ/zinc finger transcription
factor ODA-8


3435 


97 


1606 


AF093744 




unknown 


131 


100 


1607 


A12142 


synthetic 
construct 


IFN-pseudo-omega 2 


800 


98 


1608 


yS7949 


Homo sapiens


Human transmembrane protein 


1868 


100 


1609 


AF151044 


Homo sapiens 


HSPC210 


681 


^7 


1610 


X15218 


Homo sapiens 


SKI protein (AA 1-728)


376S 


100 


1611 


Y08200 


Homo sapiens 


rab geranylgeranyl
transferase 


2976 


100 


1612 


AF220560 


Homo sapiens


B/K protein 


2486 


99 


1613 


X* V* \J \J *T ■» O JL. 


^xTdo J. uop S JL s 


nodulin-like protein


371 


26 


1614 


Y09501 


Homo sapiens 


NADH-cytochrome-b5 reductase 


1607 


100 


1615


yi5S21


Homo sapiens


start position 1


3150 


97 


1616 


AJ010750 


Rattus 
norvegicus 


Castration induced prostatic 
apoptosis related protein-1.


890 


62 



1617 


X58079 


Homo sapiens 


S100 alpha protein


481


100


1618 




sapiens 


Membrane-bound protein

irrsxy^ u V/ ^ 4 


967 


100 


1619 


AJ242973 


Homo sapiens 


peptide methionine sulfoxide 
reductase 


929 


100 


1620 


AF150733 


Homo sapiens 




288 


100 


1621 


AJ007509


Homo sapiens 




4646 


98 


1622 


X64177 


Homo sapiens


metallothionein


380 


100 


1623 


AE001045 


Archaeoglobu 
s fulgidus 


A. fulgidus predicted coding 
region AF0859 


240 


36 


1624 


AL355013 


Schizosaccha

romyces

pombe


mitochondrial carrier protein


403


34 


1625 


Y66746 


Homo 
sapiens 


Membrane-bound protein
PRO1198.


1184 


100 


1626 


D90053


Sus scrofa 


destrin 


863 


100 


1627 


Y3^9i4 


Homo sapiens 


Extended human secreted 
protein sequence, SEQ ID NO, 
203. 


756 


100 


1628 


AL031775 


Homo sapiens 


dJ30M3.2 (novel protein) 


470 


100 


1629 


AF132484


Mus musculus 


unknown 


286 


68 


1630 


AP017096 


Drosophila 
melanogaster 


similar to C. elegans
R10H10.6 and S. cerevisiae
YD8419.03c


493 


61 


1631 


X63077 


Homo sapiens 


lactate dehydrogenase-A


1704 


100 


1632 


AF151084


Homo sapiens 


HSPC2S0 


763 


100 


1633


AJ001874


Homo sapiens 


orf 


255 


97 


1634 


AC012187 


Arabidopsis
thaliana 


Contains weak similarity to 
GATA-6 DNA-binding protein
gb|H36135, gb|Z26200 come
from this gene.


143 


38 





1635 


AF026246


Homo sapiens 


HERV-E integrase 


411 


90 


1636 


yS0943 


Homo sapiens 


Human adult brain cDNA clone 
ve8_1 derived protein.


1126 


95 


1637 


AF134593 


Homo sapiens 


L-pipecolic acid oxidase


2068 


99 


1638 


AJ238247 


Mus musculus 


putative phosphatase subunit


1948


96


1639 


Y94542 


Homo sapiens 


Human secreted protein clone 
yk251_l protein sequence SEQ 
ID NO:£)0. 


1320 


100 



1640 


AF235030 


Homo sapiens 


BMas antigen 


766 


99 


1641 


AF233288 


Drosophila 
melanogaster


WDS


358 


26 


1642 


M19351 


Mus musculus


immunoglobulin heavy chain 
binding protein 


145


34 


1643 


y704S2 


Homo sapiens 


Human membrane channel 
protein-2 (MECHP-2).


1352 


100 


1644 


AF176520 


Mus musculus


WD repeat-containing F-box 
protein FBW5 


2676 


88 


1645 


W67816 


Homo sapiens 


Human secreted protein 
encoded by gene 10 clone 
HCEMa42 . 


1156 


100 


1646 


X67155 


Homo sapiens 


mitotic kinaae-like protein-1 


4456 


99 


1647 


M63180 


Homo sapiens 


threonyl-tRNA synthetase


1040 


61 


1648 


Y87342


Homo sapiens 


Human signal peptide 
containing protein HSPP-119 
SEQ ID NO: 119. 


1566 


93 


1649 


R95332 


Homo sapiens 


Tumor necrosis factor 
receptor 1 death domain 
ligand (clone 3TW).


4137 


100 


1650 


AC00713ri 


Homo sapiens 


Putative map kinase 
interacting kinase 


856 


99 


1651


AB015346


Homo sapiens


Eps15R


4464


99 


1652 


AL161576 


Arabidopsis
thaliana 


putative protein 


1341 


48 


1653


AC0OS313 


Arabidopsis 
thaliana 


putative calmodulin 


288 


28 


1654


AL031428


Homo sapiens


dJ184J9.1 (KIAA0601 protein)


3526 


100 


1655 


AL031428


Homo sapiens 


dJ184J9.1 (KIAA0601 protein) 


3526 


100 


1656 


AB017910 


Dictyosteliu 
m discoideum 


myoM 


297 


32 


1657 


Y28919


Homo 
sapiens 


Human regulatory protein 
HRGP-S. 


2251 


99 


1658 


AF056X91 


Homo sapiens 


TPA inducible protein 


2744 


98 


1659 


U76846 


Arabidopsis 
thaliana 


ubiquitin-specific protease


137 


35 


1660 


AL078627


Schizosaccha 

romyces 

pombe 


actin-like protein; (2 actin
domains) 


320 


34 


1662 


X52022 


Homo sapiens 


collagen type VI, alpha 3
chain 


16274 


99 


1663 


AF300648 


Homo 
sapiens 


guanine nucleotide binding 
protein beta subunit 4


1811 


100 


1664 


AF214736


Homo sapiens 


EH domain containing protein 
2 


2774 


100 


1665 


Z48613


Saccharomyce 
s cerevisiae 


unknown 


138 


26 


1666 


AF177385 


Homo 
sapiens 


cytochrome c oxidase assembly 
protein isoform 2


1395 


99 


1667 


AC007842 


Homo sapiens 


BC331191_1


1581 


47 


1668 


S67513 


Borna
disease
virus BDV,
WT-1, Halle
B1/91, horse
brain, field
isolate.
Peptide, 370


p40


397 


43 









aa 








1669 


Z99753 


Schizosaccha 

romyces 

pombe 


putative NOL1-NOP2-sun family
nucleolar protein 


569 


47 


1670 


G03130 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7211. 


427 


97 


1671 


M96625


Gallus
gallus


cardiac muscle tensin


1185 


54 


1672 


AF174482 


Homo sapiens 


polycomb 3 


2005 


99


1673 


YS1846 . 


Homo sapiens 


Human 18,1 homolog protein 
fragment . 


233 


29


1674 


AF25S334 


Homo sapiens 


EXP35 


152 


29 


1675


Y94367 


Homo 
sapiens 


Human protein clone HP10S63, 


109 


30 


1676 


Y25712


Homo sapiens


Human secreted protein
encoded from gene 2.


3043 


99 


1677 


Y25712


Homo sapiens


Human secreted protein
encoded from gene 2.


1580 


91 


1678 


AF163151


Homo sapiens 


dentin sialophosphoprotein 
precursor 


170 


17 


1679 


AF163151 


Homo sapiens 


dentin sialophosphoprotein 
precursor 


170


17 


1680 


AK024453 


Homo sapiens 


FLJ00045 protein 


1349


100 


1681 


AF019236 


Dictyosteliu 
m discoideum 


TipD 


613 


34 


1682


AJ243459 


Leishmania 
major 


proteophosphoglycan


153 


26 


1683 


Z69269 


Schizosaccha

romyces

pombe


putative GTP-binding protein


660 


46 


1684 


X94910 


Homo sapiens


ERp28 


1334 


100 


1685 


AF286475 


Takifugu
rubripes


retinitis pigmentosa GTPase
regulator-like protein


196 


19 


1686 


AF191298 


Homo sapiens 


vacuolar sorting protein 35 


4087 


100 


1687 


AJ275986


Homo sapiens 


transcription factor 


2958 


100 


1688 


AJ275986 


Homo sapiens 


transcription factor 


1886 


88 


1689 


X07311 


Drosophila 
melanogaster 


heat shock protein 


138 


43 


1690 


AF240463 


Rattus 
norvegicus 


LIS1-interacting protein
NUDE1


1383 


83 


1691 


AJ272078 


Homo sapiens 


APOBEC-1 stimulating protein 


1256 


68 


1692 


AJ272079 


Homo sapiens 


APOBEC-1 stimulating protein


1336 


60 


1693 


AF177942


Xenopus
laevis


katanin p60


1664 


^6 


1694 


o J ^ 


Homo sapiens


arginine N-methyltransferase


1774 


100 


1695 


JtW ^ ^ D O 27 


sapiens 


protein arginine N-

methyltransferase 1-variant 2


1182 


81 


1696


mw/ vj V Xi 17 J 


Homo sapiens 


unnamed protein product


1060 


100 


1697 


AB041035 


Homo sapiens 


kidney superoxide-producing 
NADPH oxidase


3122 


100 


1698 


AB041035 



Homo sapiens 


kidney superoxide-producing
NADPH oxidase


2181 


100 


1699 


AF025772 


Homo sapiens 


C2H2 zinc finger protein 


488 


54 


1700 


Y44676


Homo sapiens


Human ARF-Related Protein-1
(HARP-1).


938 


97 


1701 


AK022407 


Homo sapiens 


unnamed protein product 


315 


98 


1702 


AB024574 


Homo sapiens


GTP-binding like protein 2


1172 


100 


1703 


AF05S078 


Homo sapiens 


zinc finger protein 42 


421 


52 


1704 


AF198092 


Mus musculus


RP42 


1057 


77 


1705


AE003S73 


Drosophila 
melanogaster 


CG12474 gene product 


161 


33 


1706 


AB036345 


Drosophila 
melanogaster 


aquaporin 


164 


24 


1707 


Y5S927 


Homo sapiens 


Human STLK2 protein. 


2146 


100 


1708 


U27121 


Danio rerio 


G12 


212 


47 


1709 


AL391710 


Arabidopsis
thaliana


putative protein


505


50


1710


B01311


Homo sapiens


Human PRO241 polypeptide.


1649 





1711 


U40750 


Mus musculus


A. 1 11.1- 11 U4.1.UXI|^ pj^Oi-CXrl 


4561 


85 


1712 


AJ011118 




protein 


1490 


09 


1713 


AF255303 


Homo 
sapiens 


membrane-associated nucleic 


4416 


9S 


1714 


AF255303 


Homo 
sapiens 


membrane-associated nucleic 

aL- J.(a uj.iAUX4l^ pj^OC6 Xn 


2960 




1715 


U08227 


Rattus 
norvegicus


Ras-related protein


511 


51 


1716 


AF168795 


Rattus 


schlafen-4 



1129 


44 


1717 


AF196304 




SUMO-1-specific protease


5S04 


99 


1718 


AL355737 


Homo sapiens


HfVjr^: UA 


1782 


100


1719 


V ^ J J 


Halocynthia
roretzi 


HrPET-1 


1069 


46 


1720 


iiP071 ^ 1 7 

V ' *■ J <L / 


Mus musculus


COP9 complex subunit 7b


1297 


97 


1721 


AJ272215 


Homo sapiens 


HEYL protein 


1681 


99 


1722 


G01982 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6063.


718 


100 


1723 




Caenorhabdit 
is elegans


similar to Uncharacterized
protein family UPF0034.


825 


41 


1724


G01972


Homo sapiens



Human secreted protein, SEQ
ID NO: 6053.


58^ 


92 


1725


Y94 44i 


Homo 
sapiens 


Human Adipose Specific 
Protein 1.


1231 


100 


1726 


AP2SS44'* 


Homo sapiens 


CGI-201 protein


4397 


99 


1727 


^ *t ^ Q 


Homo sapiens 


HT004 protein 


1810 


99 


1728


D10884


Bos taurus


neurocalcin


1002 


99 


1729 


'Z18S2'9 


Gallus
gallus


tensin


1411 


84 


1730 


Z73423 


Caenorhabdit 
is elegans


cDNA EST EMBL:Z14908 comes
from this gene-cDNA EST this 
gene 


233 


41 


1732 


AF0S0B91 


Homo sapiens 


rpfe5oio5 


470 


30 


1733 


AJ277724


Homo sapiens


histone deacetylase 8


2015 


100 


1734 


G04050 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8131. 


503 


95 


1735 


D45913 


Mus musculus 


leucine-rich-repeat protein


3531 


94 


1736 


AF036709 


Drosophila
virilis


failed axon connections 
protein 


276 


32 


1737 


AF19S120 


Homo sapiens 


dynactin p62 subunit 


2417 


99 


1738 


Iil53l4 


Caenorhabdit 


contains similarity to Pfam 
i.amxi.y Frui,//2 M=l 


206 


37 


1739 


X54618 


Listeria 
monocytogene
s


phosphatidylinositol specific
phospholipase C


134 


27 


1740 


AL031658 


Homo sapiens 


uu J . 4 ^novei prouein 
similar to predicted c, 

elecrans an r* *irt^'o<ah'Jr»a1 i o 

proteins) 


123 


31 


1741 


y3S924 


Homo sapiens 


Extended human secreted 
protein sequence, SEQ ID NO.
173.


1013 


99 


1742 


AC013354 


Arabidopsis
thaliana 


FlbHia.15 


202 


32 


1743 


W75771


Homo 
sapiens 


Human GTP binding protein 
APD08. 


1932 


59 


1744 


W75771 


Homo 
sapiens 


Human GTP binding protein 
APD08. 


1854 


61 


1745 


AF221098 


Homo 
sapiens 


Ral guanine nucleotide
exchange factor RalGPSlA 


1224 


70 


1746 


Y99372


Homo sapiens 


Human PRO1430 (UNQ736) amino 
acid sequence SEQ ID NO: 116. 


1332 


99 


1747 


ly^^yfi 1 Homo sapiens 


Human coenzyme A-utilising 


842 


100











enzyme CoAEN-2.






1748


AK024436




f i-tiJ u u U id D p3rOCftxn 


1619 


100 


1749 


AE000877 


Methanobacte 

thermoautotr 
ophicum 


conserved protein 


231 


36 


1750 


API 013 61 


tnS 1 fLtlOCf «l S 1 6 IT 


Abnormal X segregation


193


33


1751 


yiS067 


Homo sapiens


ZNF232


889


100


1752 


AF2d103 8 




wii£'~j.x)te procexn 


822 


100


1753 


AC003093 


Homo sapiens 


OXYSTEROL-BINDING PROTEIN;
45% similarity to P22059
(PID:g129308)


352 


57 


1754 


X69089 


Homo sapiens


165kD protein 


5703 


99 


1755 


AL049795


Homo sapiens


dJ622L5.3 (novel protein)


1039 


100 


1756 


A1»U^1393 


Homo sapiens 


dJ733D15.1 (Zinc-finger
protein) 


276S 


100 


1757 


AB040672


Homo sapiens


UDP-GalNAc: polypeptide N- 

acetylgalactosaminyltransfera 

se 


2020 


99 


1758 


AL022236


Homo sapiens


dJ1042K10.4 (novel protein)


776 


43 


1759 


AF117653 


Homo sapiens 


double homeobox protein 


375 


54 


1760


Y12665


Homo sapiens


hNop56


2959 


99 


1761


AL049712 


Homo sapiens 


dJ686C3.2 (nucleolar protein 
hNop56)


2595 


99 


1762 


AC002394


Homo 
sapiens 


Gene product with similarity 
to dynein beta subunit


1542 


51 


1763 


AF169017 


Homo sapiens 


formiminotransferase
cyclodeaminase 


877 


100 


1764 


U91541 


Homo sapiens 


human formiminotransferase
cyclodeaminase (ftcd) protein,
carboxy-terminal end


596


100 


1765 




Bacillus 
halodurans 


YlqF


3S0 


34 


1766 


y3842l 


Homo sapiens 


Human secreted protein 
encoded by gene No. 36. 


145 




1767 


AC009176 


thaliana 


putative ribulose-1,5-
bisphosphate 


216 


27 


1768 


AK000647 


Homo sapiens 


unnamed protein product 


737 


99 


1769 


AJ238982 


Homo sapiens 


VNN3 protein 


2665 


99 


1770 


U73522 


Homo sapiens 


AMSH 


1214 


56 


1771 


089435 


Mus musculus


unknown 




86 


1772 


S70011 


Rattus sp. 


tricarboxylate carrier


1604 


95 


1773 


AL035086 


Homo sapiens 


dJ44A20 ? /nrw(*i Tiv*rtt-j» *\ l 


2036 


100 


1774 


Y99426 


Homo sapiens 


Human PRO1604 (UNQ785) amino
acid sequence SEQ ID NO: 308


10S7 




1775 


AF110330 


Homo sapiens 


glutaminase 


^ — 


100 


1776 


AJ269529 


Homo sapiens


cilvcerol 3 ■^chosDhatp' ti^t* a «<i 


2787 


100 


1777 


Z81579


Caenorhabdit 
is elegans 


cDNA EST yk:76fl.5 comes from 
this gene 


232 


31 


1778 


AY007239


Homo sapiens 


monooxygenase X 


1875 


99 


1779 


AL109608 


Schizosaccha 

romyces 

pombe 


oxysterol-binding protein
family 


644 


38 


1780


AF254260 


Homo sapiens 


tuftelin 1 


1729 


100 


1781 


L07924 


Mus musculus 


guanine nucleotide 
dissociation stimulator 


247 


SO 


1782 


AF295773 


Homo 
sapiens 


ral guanine nucleotide 
dissociation stimulator 


142 


49


1783 


AK024475 


Homo sapiens 


FLJ00068 protein


4333 


100 


1784 


AK024475 


Homo sapiens 


FLJ00068 protein


3996 


93 


1785


G03933


Homo sapiens


Human secreted protein, SEQ
ID NO: 8014.


570 


100 


SB2637 


Homo sapiens 


Ig lambda-like gene/beta- 


247 


100











glucuronidase exon 11 homolog



TABLE 3



SEQ ID NO: 


ACCESSION 
NO. 


DESCRIPTION 


RESULTS* 


2 


BL00240


Receptor tyrosine kinase 
class III proteins. 


BL00240B 24,70 e,250e- 
12 157-181 


3 


PR00109


TYROSINE KINASE
CATALYTIC DOMAIN
SIGNATURE


PR00109D 17.04 8.085e-
13 358-381 


4 


BL00028 


Zinc finger, C2H2 type,
domain proteins.


BLOO028 16.07 9,4Qbe- 
10 1129-1146 BL00028 
16.07 l,257e-09 820- 
837 


5 


BL00023 


Type II fibronectin 
collagen- binding domain 
proteins . 


BL00023 24.31 8.920e- 
33 413-450 BL00023 
24.31 4.545e-27 353-
390 


6 


BL00023


Type II fibronectin
collagen-binding domain
proteins.


BL00023 24.31 8.920e-
33 413-450 BL00023
24.31 4.545e-27 353-
390 


7 


BL00023 


Type II fibronectin 
collagen- binding domain 
proteins . 


BL00023 24.31 8.920e-
33 413-450 BL00023
24.31 4.545e-27 353-
390 


8 


BL00023 


Type II fibronectin
collagen-binding domain
proteins.


BL00023 24.31 8.920e-
33 413-450 BL00023
24.31 4.545e-27 353-
390


9


BL01160 


Kinesin light chain 
repeat proteins.


BL01160B 19.54 S,119e- 
09 863-917 


10 


PR00464 


E-CLASS P450 GROUP II
SIGNATURE


PR00464D 17.40 6.182e-
12 294-312 PR00464G
12.41 4.231e-11 377-
393 


11 


PR00734 


GLYCOSYL HYDROLASE
FAMILY 7 SIGNATURE


PR00734I 11.46 4.296e-
09 502-520 


12 


PF00023 


Ank repeat proteins. 


PF00023B 14.20 6.500e- 
10 89-99 PF00023B
14.20 2.636e-09 56-66 


14 


DM00031 


IMMUNOGLOBULIN V REGION. 


DM00031B 15.41 3.848e- 
09 79-113 


15 


PR00208 


GLIADIN AND LMW GLUTENIN
SUPERFAMILY SIGNATURE


PR00208A 12.59 9.868e-
10 517-535 PR00208A 
12.59 2.233e-09 520- 
538 


17 


PD00066 


PROTEIN ZINC-FINGER 
METAL-BINDI.


PD00066 13.92 8.200e- 
14 282-295 PD00066 
13.92 9,4a0e-14 477- 
490 PD00066 13.92 
6.500e-13 505-518
PD00066 13.92 9.500e-
13 254-267 PD00066
13.92 1.429e-12 393-
406 PD00066 13.92
6.571e-12 421-434


18 


BL00845


CAP-Gly domain proteins.


BL00845 16.43 2.200e-
25 55-80 


20 


BL00487 


IMP dehydrogenase / GMP 
reductase proteins. 


BL00487E 16.12 5.737e- 
26 154-199 BL00487F 
18.79 8.984e-22 235- 
276 BL00487G 26.82
4.082e-12 287-329 


21 


BL00487 


IMP dehydrogenase / GMP 
reductase proteins. 


BL00487E 16.12 5.737e-
26 154-199 BL00487F
18.79 8.984e-22 235-
276 BL00487G 26.82
4.082e-12 348-390 


22 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 3.250e-
26 302-333


23 


BL00107


Protein kinases ATP-
binding region proteins.


BL00107A 18.39 3.250e-
26 302-333 


25 


BL00115


Eukaryotic rna
polymerase II
heptapeptide repeat
proteins.


BL00115T 8.45 7.273e- 
29 1208-1242 BL00115Q 
18.08 2.776e-21 953- 
983 BL00115Y 11.86 
8.000e-17 1604-1650 
BL00115M 19.19 8.130e-
16 731-774 BL00115H 
14.34 9.392e-16 463- 
496 BL00115A 15,44 
7.414e-15 43-82 
BL00115R 6.50 6.128e-
14 9S'^-10lft tiT.nmiCT 

**■— ^ ^ *^ UXiv w X 

16.71 S.289e-14 591- 
617 BL00115I 8.33 
4.336e-13 «;t^-c;qo 
BL00115L 12.25 5.939e-
13 662-694 BL00115G
11.65 6.011e-13 435-
463 BL00115K 15.03
3.417e-10 617-659
BL00115O 16.76 5.805e-
10 863-913 BL00115P
11.54 7.538e-10 913-
953 BL00115S 18.24
7.968e-10 1010-1052
BL00115U 10.34 4.475e-
09 1242-1265


26 


BL00420


Speract receptor repeat 
proteins domain 
proteins. 


BL00420A 20.42 4.109e- 
11 81-110 BL00420A 
20.42 8.820e-10 84-113 


27 


BL00050 


Ribosomal protein L23 
proteins. 


BL00050A 23.71 9.250e- 
27 94-127 BL00050B 
14.81 8.125e-12 133-
147 


28 


PR00925 


NONHISTONE CHROMOSOMAL
PROTEIN HMG17 FAMILY
SIGNATURE


PR00925B 3.73 3.089e-
10 41-54


29 


PF00756 


Putative esterase. 


PF00756C 14.12 1.105e-
09 486-516 


32 


BL00557


FMN-dependent alpha-
hydroxy acid
dehydrogenases proteins.


BL00557D 17.76 5.065e-
37 274-316 BL00557A
35.08 8.909e-29 24-73
BL00557C 15.59 1.000e-
28 227-257 BL00557B
21.27 8.898e-22 130-
169


34 


PR00629 


SHC PHOSPHOTYROSINE 
INTERACTION DOMAIN 
SIGNATURE 


PR00629E 9.90 5.886e-
35 299-328 PR00629F
10.95 8.364e-32 334-
361 PR00629B 13.66
3.786e-27 224-247
PR00629A 13.45 8.364e-
21 206-222 PR00629C 
3.80 4.000e-12 249-261 
PR00629D 12.45 3.739e- 
11 276-286 


35 


PD01270 


RECEPTOR FC 
IMMUNOGLOBULIN AFFIN.


PD01270A 17.22 1.000e-
40 39-79 PD01270B
22.18 2.875e-38 94-131
PD01270D 24.66 3.700e-
34 171-207 PD01270C
19.54 3.455e-30 137-
166


36 


P0O127O 


RECEPTOR FC 
IMMUNOGLOBULIN AFFIN. 


PD01270A 17.22 1.000e-
40 39-79 PD01270B
22.18 2.875e-38 94-131



SEQ ID NO:


ACCESSION 
NO.


DESCRIPTION 


RESULTS* 








PD01270D 24.66 3.700e-
34 171-207 PD01270C 
19.54 3.455e-30 137- 
166 


37 


BL00412


Neuromodulin (GAP-43)
proteins.


BL00412C 10.28 9.241e-
10 264-298 


38 


BL00412


Neuromodulin (GAP-43)
proteins.


BL00412C 10.28 9.241e-
10 264-298 


39 


BL00412 


Neuromodulin (GAP-43) 
proteins.


BL00412C 10.28 9.241e- 
10 264-298 


40 


PR00380 


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380B 12.64 7.366e-
14 342-360 PR00380C
13.18 6.927e-13 375-
394 PR00380D 9.93 
2.180e-12 429-451 
PR00380A 14.18 5.154e- 
12 143-165 


44 


BL00345 


Ets-domain proteins.


BL00345B 21.28 1.000e-
40 239-290 BL00345A
13.96 2.452e-14 204- 
223 


45 


BL00345


Ets-domain proteins.


BL00345B 21.28 1.000e-
40 215-266 BL00345A
13.96 2.452e-14 180- 
199 


46 


DM01551


kw OSTEOINDUCTIVE YOPM 
MEMBRANE OUTER. 


DM01551A 15.63 3.538e- 
26 172-202 DM01551C
14.62 3.571e-17 232-

<i 9 4 Ul'. UX93Xi3 0,0*X 

4.750e-11 214-226


47 


PR00876 


NEMATODE METALLOTHIONEIN
SIGNATURE


PR00876B 7.66 9.328e-
11 246-260 


48 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL- 
BINDING NU. 


PD01066 19.43 4.231e-
33 6-45 


50 


BL00972 


Ubiquitin carboxyl-
terminal hydrolases
family 2 proteins.


BL00972D 22.55 7.750e-
19 994-1019 BL00972A
11.93 7.120e-18 216-
234 BL00972E 20.72
9.471e-14 1020-1042
BL00972C 16.48 7.000e-
13 360-375 BL00972B
9.45 8.269e-10 302-312


51 


BL00972 


Ubiquitin carboxyl- 
terminal hydrolases 
family 2 proteins. 


BL00972D 22.55 7.750e- 
19 990-1015 BL00972A 
11.93 7.120e-10 216-
234 BL00972E 20.72
9.471e-14 1016-1038
BL00972C 16.48 7.000e-
13 360-375 BL00972B 
9.45 8.269e-10 302-312 


52 


BL01115


GTP-binding nuclear
protein ran proteins.


BL01115A 10.22 3.063e-
14 10-54 


53 


PR00988 


URIDINE KINASE SIGNATURE 


PR00988A 6.39 5.500e-
17 20-38 PR00988F
12.23 7.828e-15 196-
210 PR00988C 13.64
6.108e-14 104-120
PR00988E 8.27 3.872e-
11 174-186 PR00988D
5.95 6.878e-10 160-171
PR00988B 11.60 2.915e-
09 57-69 


55 


PR00762 


CHLORIDE CHANNEL 
SIGNATURE 


PR00762C 9.29 4.682e- 
21 294-314 PR00762D
11.29 4.103e-19 509-
530 PR00762A 14.22
9.333e-18 199-217
PR00762F 15.12 3.100e- 
16 563-583 PR00762B 
12.12 6.G63e-16 230-
250 PR00762E 12.07
2.286e-15 545-562
PR00762G 14.13 6.276e-
13 601-616 


56 


BL00216 


Sugar transport 
proteins.


BL00216B 27.64 5.800e-
10 153-203 


58 


PF00791 


Domain present in ZO-1 
and Unc5-like netrin
receptors.


PF00791B 28.49 2.049e- 
10 1080-1135 


59 


PF00791 


Domain present in ZO-1 
and Unc5-like netrin
receptors. 


PF00791B 28.49 2.049e- 
10 1062-1117 


61 


PD01929 


KINASE TYPE RESISTANCE 
ANTIBIOTIC TRANSFERASE 
AM. 


PD01929E 10.76 9.018e- 
09 206-221 


68 


PR00360 


C2 DOMAIN SIGNATURE 


PR00360A 14.59 7.395e-
09 680-693 


69 


PR00360 


C2 DOMAIN SIGNATURE 


PR00360A 14.59 7.395e- 
09 670-683 


70 


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 8.714e-
10 51-64


72 


DM00179 


w KINASE ALPHA ADHESION
T-CELL.


DM00179 13.97 5.304e- 
09 108-118 


73 


BL00239


Receptor tyrosine kinase
class II proteins.


BL00239B 25.15 7.075e-
12 118-166 


74 


BL00790 


Receptor tyrosine kinase 
class V proteins.


BL00790N 13.25 6.116e-
10 93-120 


76 


DM00471 


0 PROKARYOTIC DNA 
TOPOISOMERASE I.


DM00471A 11.73 9.357e-
13 53-66 DM00471B
8.45 4.857e-12 70-81


80 


PD02876 


DECARBOXYLASE 
PHOSPHATIDYLSERINE.


PD02876C 8.80 2.723e-
13 223-236 PD02876D
12.13 2.588e-12 334-
351 


81 


PD02876 


DECARBOXYLASE
PHOSPHATIDYLSERINE.


PD02876C 8.80 2.723e-
13 282-295 PD02876D
12.13 2.588e-12 393-
410 


83 


BL00708


Prolyl endopeptidase 
family serine proteins. 


BL00708B 24.91 7.197e- 
12 570-601


84


PR00014


FIBRONECTIN TYPE III 
REPEAT SIGNATURE 


PR00014C 15.44 8.043e- 
09 985-1004 


86 


PR00678 


PI3 KINASE P85
REGULATORY SUBUNIT
SIGNATURE


PR00678H 9.13 1.379e-
09 246-269 


89 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE


PR00320C 13.01 8.200e-
09 264-279 PR00320B
12.19 8.650e-09 264-
279 


93 


BL00455


Putative AMP-binding
domain proteins. 


BL00455 13.31 2.588e- 
14 316-332 


95 


BL00107


Protein kinases ATP-
binding region proteins.


BL00107A 18.39 4.000e-
10 123-154 


96 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 4.000e- 
10 212-243 


97 


PR00081 


GLUCOSE/RIBITOL 
DEHYDROGENASE FAMILY 
SIGNATURE


PR00081B 10.38 6.318e-
13 134-146 PR00081A
10.53 2.500e-12 54-72


98 


PR00380


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380A 14.18 5.500e-
24 401-423 PR00380D
9.93 7.188e-20 613-635
PR00380B 12.64 7.517e-
16 529-547 PR00380C
13.18 2.756e-13 560-
579


102 


PR00300 


ATP-DEPENDENT CLP
PROTEASE ATP-BINDING
SUBUNIT SIGNATURE


14 289-308 


104 


BL00479


Phorbol esters /
diacylglycerol binding
domain proteins.


BL00479B 12.57 6.786e-
x« zye-314 BL00479A 
19.86 4.913e-16 155-
178 BL00479A 19.86 

BL00479B 12.57 6.294e-
12 181-157 


106 


BL01019 


ADP-ribosylation factors
family proteins.


BL01019A 13.20 8.013e-
12 43-83 


107 


DM01970


0 KW ZK632.12 YDR313C
ENDOSOMAL III


DM01970B 8.60 5.000e-
lb 403-416 


108 


BL00191 


Cytochrome b5 family,
heme-binding domain
proteins.


BL00191K 17.38 4.951e-
27 238-282 BL00191J
11.37 6.447e-17 182-
204 


109


PD01066 


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 4.938e-
37 8-47 


110 


BL01138


Scorpion short toxins
proteins.


BL01138A 10.96 8.297e-
10 38-50 


113 


BL00107


Protein kinases ATP-
binding region proteins.


BL00107A 18.39 5.500e-
23 156-187 BL00107B
13.31 9.100e-14 225-
241 


117 


BL00214 


Cytosolic fatty-acid
binding proteins.


BL00214B 26.51 1.000e-
17 46-91 BL00214A
21.17 7.052e-11 5-31


118 


BL00107


Protein kinases ATP-
binding region proteins.


BL00107A 18.39 8.560e-
13 36-67 


119 


PR00529


GONADOTROPHIN RELEASING
HORMONE RECEPTOR


PR00529C 11.03 7.506e-
10 158-177 


120 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE


PR00320C 13.01 9.400e-
09 80-95 


121 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE


PR00320C 13.01 9.400e-
09 80-95 


127 


BL00215 


Mitochondrial energy 
transfer proteins.


BL00215A 15.82 7.158e-
13 216-241 


128 


BL01032 


Protein phosphatase 2C
proteins.


BL01032C 6.14 3.195e-
12 147-157 BL01032H 
o,obue-i,i 3jIL8~ 
331 BL01032G 8.33 

BL0l032r 10.42 8.902e- 
09 379-389 


129 


BL01310


ATP1G1 / PLM / MAT8
family proteins.


BL01310 14.74 5.694e-
26 28-64 


130


PR00990 


RIBOKINASE SIGNATURE 


PR00990B 12.32 9.534e-
15 47-67 PR00990A
16.23 5.500e-14 20-42
PR00990C 12.62 2.412e-
09 119-133 


J-<iO 


BL00880 


Acyl-CoA-binding
protein.


BL00880 17.52 5.576e-
26 72-122 


134 


BL00030


Eukaryotic RNA-binding
region RNP-1 proteins.


BL00030A 14.39 9.308e- 
14 18-37 


135 


PR00215 


NEUROMODULIN SIGNATURE


PR00215C 13.98 6.779e-
10 475-496 


136


BL01310


ATP1G1 / PLM / MAT8
family proteins. 


BL01310 14.74 5.432e- 
29 71-107 


140 


BL00028 


Zinc finger, C2H2 type, 
domain proteins.


BL00028 16.07 7.882e-
14 214-231 BL00028
16.07 9.471e-14 102-
119 BL00028 16.07
2.500e-13 18-35
BL00028 16.07 5.500e-
13 74-91 BL00028
16.07 9.100e-13 186-
203 BL00028 16.07
8.043e-12 46-63
BL00028 16.07 8.435e-
12 130-147 BL00028
16.07 9.217e-12 270-
287 BL00028 16.07
6.192e-11 242-259
BL00028 16.07 4.000e-
10 158-175 


141 


BL00501


Signal peptidases I
serine proteins.


BL00501D 16.69 9.538e-
14 113-133 BL00501C
9.61 8.688e-10 89-101


143 


BL01020 


SAR1 family proteins.


BL01020C 15.35 7.722e-
20 79-130 


146 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL- 
BINDING NU. 


PD01066 19.43 6.400e- 
25 335-374 


149 


BL00126 


3'5'-cyclic nucleotide
phosphodiesterases
proteins.


BL00126C 22.07 1.450e-
25 509-550 BL00126E
35.22 3.951e-16 654-
709 BL00126D 25.50
1.360e-15 565-604
BL00126B 15.20 8.200e-
11 483-493 BL00126A
27.56 8.759e-11 442-
479 


151 


BL00632 


Ribosomal protein S4
proteins.


BL00632 23.79 5.271e-
20 106-149 


154 


BL00559 


Eukaryotic molybdopterin
oxidoreductases
proteins.


BL00559I 13.63 5.304e-
19 29-58 BL00559K
13.17 2.957e-18 172-
199 BL00559J 19.63
8.385e-13 99-151
BL00559L 13.60 5.814e-
12 241-259 


155 


PR00449


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 1.692e- 
13 13-35 


157 


BL00406 


Actins proteins. 


BL00406D 12.58 2.547e-
18 275-330 BL00406A
9.95 5.776e-16 15-50
BL00406B 5.47 7.429e-
12 69-124 BL00406C
6.75 9.682e-12 128-183


160 


BL00132 


Zinc carboxypeptidases, 
zinc-binding region 1 
proteins.


BL00132A 26.07 7.000e-
14 22-63 BL00132C
21.35 3.466e-12 104-
145 


165 


PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109B 12.27 9.043e-
13 139-158 


168 


BL00362 


proteins . 


i31juu ^4.D/ y,/ooe — 
15 129-172 


169 


BL00039


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039D 21.67 1.000e-
35 640-686 BL00039A
18.44 1.964e-13 212-
251 BL00039B 19.19
4.553e-13 378-404 
BL00039C 15.63 8.773e- 
12 465-489 


175 


PR00449


TRANSFORMING PROTEIN P21
RAS SIGNATURE


PR00449A 13.20 3.721e-
12 14-36 


178 


BL01310


ATP1G1 / PLM / MAT8
family proteins.


BL01310 14.74 2.432e-
29 133-169 


179 


PD01066


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 9.455e-
36 6-45


180 


PR00007


COMPLEMENT C1Q DOMAIN
SIGNATURE


PR00007B 14.16 7.429e-
20 160-180 PR00007A
19.33 4.938e-19 133-
160 PR00007C 15.60
1.225e-15 206-228 
PR00QO7D q fid. c oqca 
11 238-249 


181 


BL00027


'Homeobox' domain
proteins.


BL00027 25.43 9.526e-
24 280-323 


182 


BL00027


'Homeobox' domain
proteins.


BL00027 25.43 9.526e- 
24 263-305 


183 


BL00027 


'Homeobox' domain
proteins. 


BL00027 26.43 9.526e- 
24 280-323 


184 


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 9.526e-
24 263-306


188 


PR00929 


AT-HOOK-LIKE DOMAIN
SIGNATURE


PR00929C 5.26 3.328e-
09 460-471 


189 


PR00929 


AT-HOOK-LIKE DOMAIN
SIGNATURE


PR00929C 5.26 3.328e-
09 440-451 


190 


BL00383 


Tyrosine specific 
protein phosphatases 
proteins. 


BL00383F 15.51 7.188e-
17 666-682 BL00383A
13.34 8.714e-17 162-
177 BL00383E 10.35
1.000e-14 333-344
BL00383E 10.35 7.300e-
14 628-639 BL00383F
15.51 1.720e-13 371-
387 BL00383C 10.10
3.000e-13 217-228
BL00383D 11.92 7.000e-
13 295-308 BL00383B
7.61 1.692e-11 187-196
BL00383C 10.10 1.750e-
09 509-520 BL00383D
11.92 4.000e-09 589-
602 BL00383B 7.61
8.000e-09 479-488 


191 


PR00450


RECOVERIN FAMILY
SIGNATURE


PR00450C 12.22 7.911e-
15 83-105 PR00450C 
12.22 6.286e-13 47-69 


193 


PF00564


Octicosapeptide repeat
proteins. 


jn't uu;>tj^B ii4 . 74 6.164e- 
16 227-278 


194 


PR00503


BROMODOMAIN SIGNATURE 


e'l^uv^ujjj -sU . HI 9.156e- 
15 204-224 PR00503B 

9.96 9 R7Ti*-.1"l TTrt IDT 


195 


BL00901


Cysteine
synthase/cystathionine
beta-synthase P-
phosphate att.


BL00901C 20.63 3.429e-
18 67-117 


197 


BL00636


Nt-dnaJ domain proteins.


BL00636A 8.07 6.211e-
17 40-57 BL00636B
15.11 2.000e-13 67-88


198 


PR00690 


ADHESIN FAMILY SIGNATURE 


PR00690A 10.86 9.866e-
09 463-482 


199 
201 




Ribosomal RNA adenine 
dimethylases proteins. 


BL01131A 26.62 2.343e-
12 84-130 




PR00910 


LUTEOVIRUS ORF6 PROTEIN
SIGNATURE


PR00910A 2.51 8.352e-
12 509-522 


203 


DM00215


PROLINE-RICH PROTEIN 3.


DM00215 19.43 2.286e-
10 39-72 


206


PR00261


LOW DENSITY LIPOPROTEIN
(LDL) RECEPTOR SIGNATURE


PR00261A 11.02 4.462e-
19 65-87 PR00261C
11.37 9.308e-19 65-87
PR00261D 12.47 2.867e-
18 65-87 PR00261B
14.12 4.000e-18 143-
165 PR00261A 11.02
4.533e-18 143-165
PR00261D 12.47 7.500e-
18 143-165 PR00261B
14.12 5.065e-16 65-87
PR00261C 11.37 8.967e-
16 143-165 PR00261F
11.57 4.938e-13 143-
165 PR00261E 11.08
7.180e-13 65-87
PR00261F 11.57 7.188e-
13 65-87 PR00261E
11.08 1.643e-11 143-
165 


209 


PF00791


Domain present in ZO-1
and Unc5-like netrin
receptors.


PF00791B 28.49 6.143e-
13 118-173 PF00791C 
20.98 7.680e-10 132- 
171 


211 


PR00007 


COMPLEMENT C1Q DOMAIN
SIGNATURE


PR00007A 19.33 5.731e-
19 131-158 PR00007B
14.16 4.115e-18 158-
178 PR00007C 15.60
1.675e-15 201-223
PR00007D 9.64 7.231e- 
11 233-244 


212 


BL00183 


Ubiquitin-conjugating
enzymes proteins.


BL00183 28.97 1.545e-
30 43-91


213 


BL00183


Ubiquitin-conjugating
enzymes proteins.


BL00183 28.97 1.545e-
30 43-91 


215 


BL00039


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039D 21.67 1.900e-
29 568-614 BL00039A
18.44 1.871e-23 21-60
BL00039C 15.63 1.720e-
11 364-388 BL00039B
19.19 4.064e-11 277-
303 


217 


BL00100


Chloramphenicol
acetyltransferase
proteins.


BL00100D 17.22 8.484e-
09 68-106 


219 


PR00213 


MYELIN P0 PROTEIN
SIGNATURE


PR00213C 15.94 3.969e-
11 199-227 


222 


BL00678 


Trp-Asp (WD) repeat
proteins proteins.


BL00678 9.67 1.947e-09
144-155 


224 


PR00875


MOLLUSC METALLOTHIONEIN
SIGNATURE


PR00875A 5.83 1.000e-
09 901-913 


225 


BL00636


Nt-dnaJ domain proteins.


BL00636B 15.11 8.200e-
IS 18-39 


226 


BL00636 


Nt-dnaJ domain proteins.


BL00636A 8.07 1.000e-
21 21-38 BL00636B
15.11 8.200e-19 45-66


229 


PR00301 


70 KD HEAT SHOCK PROTEIN 
SIGNATURE 


PR00301F 13.98 7.563e-
13 329-346 PR00301G
13.78 4.300e-12 361- 
382 


230 


BL00460 


Glutathione peroxidases 
selenocysteine proteins.


oJuuu^ou/i , 6 / 8 . 773 e- 
20 35-70 BL00460B 
9.73 7.429e-16 78-96
BL00460C 14.35 2.831e-
12 111-134 BL00460D
16.89 8.773e-11 140-
160


231 


PR00647 


SENR ORPHAN RECEPTOR
SIGNATURE 


PR00647B 10.19 8.522e- 
09 273-287 


233 


BL00292 


Cyclins proteins.


BL00292B 20.31 7.429e-
27 244-275 BL00292A 
22.87 7.750e-27 201- 
235 


234 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 6.308e- 
13 7-29 PR00449C
17.27 4.462e-ll 47-70 
PR00449D 10.79 7.120e- 
11 109-123 


235 


PR00019 


LEUCINE-RICH REPEAT
SIGNATURE


PR00019B 11.36 7.300e-
10 251-265 PR00019B
11.36 5.320e-09 119-
133 PR00019B 11.36
1.000e-08 229-243


236 


PR00019


LEUCINE-RICH REPEAT
SIGNATURE


PR00019B 11.36 7.300e-
10 245-259 PR00019B
11.36 5.320e-09 113-
127 PR00019B 11.36
1.000e-08 223-237


237 


PD00289 


PROTEIN SH3 DOMAIN 
REPEAT PRESYNA. 


PD00289 9.97 8.448e-09
67-81 


240 


PR00011


TYPE III EGF-LIKE
SIGNATURE


PR00011D 14.03 3.492e-
10 616-635 


241 


PR00011


TYPE III EGF-LIKE
SIGNATURE


PR00011D 14.03 3.492e-
10 616-635 


244 


BL00903 


Cytidine and
deoxycytidylate
deaminases zinc-binding
region s.


BL00903 12.93 8.941e- 
12 54-64 


245 


DM00179 


w KINASE ALPHA ADHESION 
T-CELL.


DM00179 13.97 8.043e- 
09 124-134 


248 


BL00246


Wnt-1 family proteins.


BL00246D 23.97 1.000e-
40 186-239 BL00246E
20.32 1.000e-40 305-
351 BL00246B 13.69
4.176e-36 105-140
BL00246A 15.75 2.286e-
24 70-90 BL00246C
15.56 4.857e-22 150-
175 


250 


PR00927 


ADENINE NUCLEOTIDE 
TRANSLOCATOR 1 SIGNATURE 


PR00927E 14.93 5.114e-
10 253-275 


254 


BL00674


AAA-protein family
proteins.


BL00674B 4.46 1.000e-
09 223-245 


255 


PD01796 


PROTEIN TRANSMEMBRANE 
COBALT ZINC CADMIU. 


PD01796 15.01 6.045e-
09 61-88 


255 


BL50002 


Src homology 3 (SH3)
domain proteins profile.


BL50002B 15.18 2.800e-
10 421-435 


259 


PR00094 


ADENYLATE KINASE 
SIGNATURE 


PR00094C 12.94 2.200e- 
18 87-104 PR00094D 
12.52 2.731e-14 161- 
177 PR00094A 10.31 
5.500e-14 11-25 
PR00094B 11.01 4.115e- 
13 39-54 PR00094E 
11.25 7.333e-13 178-
193 


259 


BL00892 


HIT family proteins. 


BL00892A 18.17 5.500e-
13 60-91 


262 


BL00388


Proteasome A-type
subunits proteins.


BL00388A 23.14 1.000e-
40 8-54 BL00388B
31.38 3.864e-33 66-108
BL00388D 20.71 1.000e-
21 153-184 BL00388C 
18.79 8.147e-16 126- 
148 


264 


BL00903


Cytidine and
deoxycytidylate
deaminases zinc-binding 
region s. 


BL00903 12.93 5.821e- 
09 91-101 


267 


BL00107 


Protein kinases ATP- 
binding region proteins.


BL00107B 13.31 1.529e-
09 241-257 


270 


BL00226


Intermediate filaments 
proteins. 


BL00226D 19.10 1.000e-
37 362-409 BL00226B
23.86 8.043e-35 196-
244 BL00226C 13.23
7.000e-20 261-292
BL00226A 12.77 6.143e-
15 96-111 


271 




KINASE TRANSFERASE
CHOLINE PROTEIN
MULTIGENE FAMI.


PD02952C 15.76 9.731e-
Z.6 235-265 PD02952B
15.57 5.625e-09 215-
229 


272 


PD02929 


ADHESION GLYCOPROTEIN 
PRECURSOR I. 


PD02929A 28.27 1.000e-
40 106-160 PD02929B
18.36 8.800e-17 179-
199 


274 


BL01027


family 39 proteins.


BL01027B 15.34 3.486e-
09 213-250 


275 


PR00424 


ADENOSINE RECEPTOR 
SIGNATURE 


PR00424D 14.32 6.451e-
11 39-59 


277 


BL00052


Ribosomal protein S7
proteins.


BL00052A 27.85 6.000e- 
13 137-184 BL00052B 
13.1/ b.JL4jC-X^ ^UO — 
235 


279 


BL00790 


Receptor tyrosine kinase 
class V proteins. 


13 267-294 


280 


PR00319 


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE


PR00319D 11.64 6.625e-
23 107-125 PR00319C
13.41 1.000e-21 89-105
PR00319A 15.27 8.364e-
21 51-68 PR00319B
11.47 8.200e-19 70-85


281 


PR00319 


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE


PR00319D 11.64 6.625e-
23 94-112 PR00319C
13.41 1.000e-21 76-92
PR00319A 15.27 8.364e-
21 38-55 PR00319B
11.47 8.200e-19 57-72


287 


PF00929 


Exonuclease. 


PF00929D 16.17 7.366e- 
09 149-163 


291 


BL00326


Tropomyosins proteins. 


BL00326A 14.01 2.360e- 
09 93-127 


292 


BL00326 


Tropomyosins proteins.


BL00326A 14.01 2.360e-
09 93-127 


294


PD00066


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 8.714e-
12 203-216 


295 


BL00028


Zinc finger, C2H2 type,
domain proteins.


BL00028 16.07 5.500e-
15 322-339 BL00028
16.07 9.471e-14 433-
450 BL00028 16.07
4.600e-13 648-665
BL00028 16.07 5.500e-
13 760-777 BL00028
16.07 9.550e-13 788-
805 BL00028 16.07
3.348e-12 704-721
BL00028 16.07 6.478e-
12 461-478 BL00028
16.07 8.435e-12 844-
861 BL00028 16.07
1.692e-11 593-610
BL00028 16.07 2.038e-
11 211-228 BL00028
16.07 5.154e-11 732-
749 BL00028 16.07
5.846e-11 377-394
BL00028 16.07 6.885e-
11 816-833 BL00028
16.07 7.231e-11 676-
693 BL00028 16.07
9.654e-11 564-581



WO 01/53312

PCT/US00/34263

BL00028 16.07 4.086e- 
09 517-534 BL00028 
16.07 7.429e-09 489- 
506 


296 


BL00215


Mitochondrial energy
transfer proteins.


BL00215A 15.82 8.333e-
16 111-136 BL00215A
15.82 2.723e-11 10-35
BL00215B 10.44 9.526e-
11 152-165 BL00215B
10.44 7.375e-10 59-72
BL00215A 15.82 9.824e-
10 205-230 


302 


PF00953 


Glycosyl transferase.


PF00953C 19.70 8.773e-
34 236-269 PF00953A
19.68 5.000e-25 102-
129 PF00953B 6.17
1.000e-13 182-194


304 


PF00152


tRNA synthetases class
II.


PF00152D 21.30 8.364e-
28 422-461 PF00152C
28.03 9.250e-21 220-
257 PF00152B 15.67
2.658e-13 159-184
PF00152A 19.68 5.714e-
11 44-67 


305 


PD01066 


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 8.250e- 
35 37-76 


306


PD02784


PROTEIN NUCLEAR
RIBONUCLEOPROTEIN.


PD02784B 26.46 5.840e-
09 92-135 


307 


PR00454


ETS DOMAIN SIGNATURE


PR00454C 11.24 7.808e-
09 1167-1186 


308


PR00237


RHODOPSIN-LIKE GPCR
SUPERFAMILY SIGNATURE


PR00237E 13.03 5.051e-
13 188-212 PR00237G
19.63 7.207e-13 268-
295 PR00237A 11.48
4.375e-11 24-49
PR00237C 15.69 3.057e-
10 101-124 PR00237D
8.94 4.750e-10 137-159
PR00237F 13.57 5.364e-
10 230-255 PR00237B 
13.50 9.438e-10 57-79 


309 


BL00522 


DNA-polymerase family X
proteins.


BL00522C 11.90 7.577e-
24 315-339 BL00522F
14.90 1.310e-15 470-
494 BL00522A 25.52
1.265e-14 179-226
BL00522E 19.63 8.615e-
14 430-400 BL00522B
27.30 9.625e-12 267- 
313 


310 


BL00326


Tropomyosins proteins.


BL00326D 8.76 5.235e-


312 


BL00290 


Immunoglobulins and 
major histocompatibility
complex proteins.


BL00290A 20.89 4.706e-
14 151-174 BL00290B
13.17 9.000e-12 211-
229 


313 


BL00345 


Ets-domain proteins.


BL00345B 21.28 1.000e-
40 34-85 BL00345A
13.96 9.217e-16 1-20


315 


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins.


PF00651 15.00 5.091e-
15 63-76 


317 


BL01020


SAR1 family proteins.


BL01020C 15.35 3.198e- 
17 79-130 


318 


BL00216 


Sugar transport 
proteins.


BL00216B 27.64 4.696e- 
11 164-214 


320 


PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN
SIGNATURE


PR00109B 12.27 4.814e-
10 216-235


321 


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 5.688e-
10 329-372 


322 


PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109B 12.27 8.765e-
12 558-577 


324 


BL01241 


Link domain proteins. 


BL01241 35.81 8.313e- 
30 183-236 BL01241 
35.81 3.222e-13 282-
335 


326 


BL00412 


Neuromodulin (GAP-43)
proteins.


BL00412D 16.54 4.000e-
12 515-566 BL00412D
16.54 5.705e-11 516-
567 BL00412D 16.54
7.848e-10 518-569
BL00412D 16.54 1.827e-
09 514-565 BL00412D
16.54 1.918e-09 513-
564 BL00412D 16.54
2.102e-09 520-571


328 


BL00232


Cadherins extracellular 
proteins.


BL00232B 32.79 9.557e-

32.79 2.246e-18 41-89 

18 370-418 BL00232B 
32.79 5.500e-16 258-
306 BL00232B 32.79
9.384e-15 475-523
BL00232C 10.65 2.537e-
12 256-274 BL00232C
10.65 4.326e-11 368-
386 BL00232C 10.65
7.261e-11 473-491
BL00232C 10.65 7.457e-
11 39-57 


330 


PR00454


ETS DOMAIN SIGNATURE


PR00454C 11.24 7.808e- 
09 1167-1186 


331


BL00598


Chromo domain proteins.


BL00598 14.45 8.393e-
18 27-49 


333 


BL01016 


Glycoprotease family 
proteins.


BL01016C 22.84 3.925e-
32 70-115 BL01016E
14.88 5.286e-19 149-
177 BL01016H 13.71
7.577e-13 291-301
BL01016D 8.86 3.298e-
11 127-140 BL01016G
7.14 5.622e-10 261-271
BL01016A 5.65 7.167e-
10 4-19 BL01016F
13.34 1.563e-09 200-
212 BL01016B 8.93
8.855e-09 38-50


339 


BL01115 


GTP-binding nuclear
protein ran proteins.


BL01115A 10.22 5.500e-
11 17-61 


340 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 1.231e- 
33 10-49 


341 


BL01160 


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 5.042e-
09 55-109 


342 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 2.400e- 
30 16-55 


343 


DM00031 


IMMUNOGLOBULIN V REGION.


DM00031A 16.80 1.000e-
40 20-68 


346 


PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109B 12.27 4.764e- 
11 135-154 


347 


PR00109 


TYROSINE KINASE
CATALYTIC DOMAIN
SIGNATURE


PR00109B 12.27 4.764e-
11 135-154


351 


BL01187 


Calcium-binding EGF-like
domain proteins pattern
proteins.


BL01187B 12.04 1.783e-
13 100-116 BL01187B
12.04 8.435e-13 276-
292 BL01187B 12.04
8.800e-11 13-29
BL01187B 12.04 7.429e-
10 54-70 BL01187B
12.04 5.725e-09 231-
247 BL01187A 9.98 
7.000e-09 255-267 


352


PD00078 


REPEAT PROTEIN ANK 
NUCLEAR ANKYR. 


PD00078B 13.14 5.950e-
10 366-379 PD00078B
13.14 4.522e-09 168-
181 


354 


BL00380


Rhodanese proteins.


BL00380F 9.76 6.694e-
11 542-553 


355 


PF00628 


PHD-finger.


PF00628 15.84 1.000e-
11 116-131 


356 


PR00587


SOMATOSTATIN RECEPTOR 
TYPE 1 SIGNATURE 


PR00587A 8.06 3.700e- 
09 17-37 


359


PD00066


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 4.462e-
15 261-274 PD00066
13.92 6.500e-13 233-
246 PD00066 13.92
4.300e-09 289-302


361 


PF00791


Domain present in ZO-1
and Unc5-like netrin


PF00791B 28.49 9.604e-
13 54-109 PF00791B
28.49 1.095e-12 21-76
PF00791A 27.85 1.432e-
09 71-126 PF00791B

239 


362 


PF00791 


Domain present in ZO-1
and Unc5-like netrin
receptors.


11 279-334 


363 


PR00450 


RECOVERIN FAMILY
SIGNATURE


PR00450C 12.22 5.050e-
10 73-95 PR00450C
12.22 3.278e-09 109-
131 


364 


PF00242


DNA polymerase (viral)
N-terminal domain
proteins.


PF00242Q 13.51 2.328e- 
09 22-68 


365 


PF00242


DNA polymerase (viral)
N-terminal domain
proteins.


PF00242Q 13.51 2.328e-
09 22-68 


366 


BL01160 


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 6.644e- 
09 1038-1092


367 


PR00019


LEUCINE-RICH REPEAT
SIGNATURE


PR00019B 11.36 1.360e-
09 229-243 PR00019B
11.36 6.040e-09 91-105
PR00019A 11.19 8.667e- 
09 370-384 


368 


PR00011


TYPE III EGF-LIKE
SIGNATURE


PR00011D 14.03 9.000e-
15 30-49 PR00011A
14.06 9.830e-15 30-49
PR00011B 13.08 4.500e-
14 30-49 PR00011C
24.25 5.143e-09 6-35


369 


BL01032 


Protein phosphatase 2C 
proteins.


BL01032H 11.25 4.150e- 
09 417-430 


372 


BL00478


LIM domain proteins.


BL00478B 14.79 7.750e-
12 410-425 


373 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 9.757e- 
34 26-65 


376 


PR00170 


SODIUM CHANNEL SIGNATURE 


PR00170E 6.48 2.739e-


380


BL00107


Protein kinases ATP-
binding region proteins.


10 88-118

BL00107A 18.39 1.000e-

23 276-307 BL00107B
13.31 1.692e-12 342-
358 


381 


BL00455


Putative AMP-binding
domain proteins.


BL00455 13.31 5.714e-
12 50-66 


382


PR00624


HISTONE H5 SIGNATURE


PR00624G 4.08 4.900e- 
09 524-544 


384 


PD00078 


REPEAT PROTEIN ANK
NUCLEAR ANKYR.


PD00078B 13.14 5.950e-
10 366-379 PD00078B
13.14 4.522e-09 168- 
181 


385 


PR00511


TEKTIN SIGNATURE


PR00511D 7.11 5.371e-
09 67-80 


386


PD02870


RECEPTOR INTERLEUKIN-1
PRECURSOR.


PD02870B 18.83 6.000e-
10 97-130 


383 


PD00066


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 8.000e-
13 516-529 


383 


BL00290 


Immunoglobulins and 
major histocompatibility 
complex proteins. 


BL00290A 20.89 7.657e- 
09 151-174 


390 


BL00215 


Mitochondrial energy
transfer proteins.


BL00215A 15.82 5.206e-
15 221-246 BL00215A
15.82 7.618e-14 20-45
BL00215A 15.82 8.851e-
11 123-148 BL00215B
10.44 9.526e-11 69-82
BL00215B 10.44 7.300e-
09 272-285 BL00215B
10.44 8.500e-09 165-
178 


394 


BL00674


AAA-protein family
proteins.


BL00674B 4.46 2.723e-
16 299-321 


397 


PR00048 


C2H2-TYPE ZINC FINGER
SIGNATURE


PR00048A 10.52 8.579e-
11 141-155 


398 


PR00761


BINDIN PRECURSOR
SIGNATURE


PR00761B 9.93 6.764e-
09 55-74 


399 


BL00240


Receptor tyrosine kinase
class III proteins.


BL00240B 24.70 7.907e-
10 118-142 


401 


PF00676


Dehydrogenase E1
component.


PF00676B 24.71 8.071e-
18 331-369 PF00676D
14.40 3.854e-15 486-
506 PF00676C 16.88
9.182e-14 454-478 


402 


BL00514


Fibrinogen beta and
gamma chains C-terminal
domain proteins.


BL00514C 17.41 4.673e-
28 4432-4469 BL00514G
15.98 6.092e-14 4555-
4585 BL00514D 15.35 
2.532e-12 4473-4486 
BL00514r 11.65 4,2eB&' 
10 4519-4534 BL00514H 
14.95 4.955e-10 4584-
4609 


403 


PF00992 


Troponin. 


PF00992A 16.67 5.974e-
09 105-140 


404 


PR00019


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019B 11.36 1.450e-
10 73-87 PR00019A
11.19 8.043e-10 76-90
PR00019B 11.36 1.000e-
09 50-64 PR00019B
11.36 1.000e-09 96-110


405 


BL00232


Cadherins extracellular
repeat proteins domain
proteins.


BL00232B 32.79 9.557e-
20 139-187 BL00232B
32.79 2.246e-18 29-77
BL00232B 32.79 5.985e-
18 358-406 BL00232B
32.79 5.500e-16 246-
294 BL00232B 32.79
9.384e-15 463-511
BL00232C 10.65 2.537e-
12 244-262 BL00232C
10.65 4.326e-11 356-
374 BL00232C 10.65
7.261e-11 461-479
BL00232C 10.65 7.457e-
11 27-45


407 


PF00426


Outer Capsid protein VP4
(Hemagglutinin).


PF00426S 15.67 5.634e-
09 902-940 


409 


BL01160 


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 9.695e-
09 126-180 


410 


BL00741


Guanine-nucleotide
dissociation stimulators 
CDC24 family sign. 


BL00741B 14.27 2.731e- 
09 252-275 


411 


PF00646 


F-box domain proteins. 


PF00646A 14.37 6.344e-
09 86-100 


412 


BL00603


Thymidine kinase
cellular-type proteins.


BL00603B 11.39 8.500e-
09 542-557


415 


BL00866


Carbamoyl-phosphate
synthase subdomain
proteins.


BL00866B 36.29 3.571e-
31 245-291 BL00866C 
23.26 9.000e-25 331- 
366 


418 


PR00239 


MOLLUSCAN RHODOPSIN C- 
TERMINAL TAIL SIGNATURE 


PR00239E 1.58 6.114e- 
09 590-602 


421 


PF00791 


Domain present in ZO-1 
and Unc5-like netrin
receptors.


PF00791B 28.49 7.955e-
14 23-70 PF00791B
28.49 3.653e-12 273-
328 PF00791B 28.49
4.273e-11 156-211
PF00791B 28.49 7.818e-
11 89-144 PF00791B
28.49 1.524e-10 56-111
PF00791C 20.98 3.559e-
09 37-76 PF00791C
20.98 5.235e-09 170-
209 PF00791C 20.98
5.235e-09 381-420
PF00791B 28.49 6.202e-
09 189-244 PF00791B
28.49 7.028e-09 435-
490 PF00791B 28.49
8.679e-09 367-422 


424 


DM00892


3 RETROVIRAL PROTEINASE.


DM00892C 23.55 7.207e-
28 1645-1679 


425 


PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109D 17.04 5.881e- 
10 228-251


429 


BL00518


Zinc finger, C3HC4 type 
(RING finger), proteins. 


BL00518 12.23 4.600e- 
11 31-40 


431 


BL00039


DEAD-box subfamily ATP-
dependent helicases 
proteins. 


BL00039D 21.67 1.844e- 
34 490-536 BL00039A 
18.44 5.615e-19 205- 
244 BL00039B 19.19 
8.920e-16 251-277
BL00039C 15.63 5.781e-
15 333-357


432 


PR00452


SH3 DOMAIN SIGNATURE


PR00452B 11.65 7.652e-
12 169-185 


433 


PR00828 


FORMIN SIGNATURE 


PR00828B 5.23 8.218e-
10 382-405 


436 


BL00415


Synapsins proteins. 


BL00415N 4.29 8.643e- 
11 195-239 BL00415N 
4.29 3.036e-09 809-853 


443 


PR00834


HTRA/DEGQ PROTEASE 
FAMILY SIGNATURE 


PR00834F 10.91 6.040e- 
11 221-234 


446 


PF01140


Matrix protein (MA),
p15.


PF01140D 15.54 9.663e-
10 183-218 PF01140D
15.54 3.093e-09 246-
281 


449 


PR00568


DOPAMINE D3 RECEPTOR
SIGNATURE


PR00568G 13.95 5.551e-
09 39-53


451 


PF00084 


Sushi domain proteins
(SCR repeat proteins).


PF00084B 9.45 3.813e-
10 47-59


452 


BL00790


Receptor tyrosine kinase
class V proteins.


BL00790I 20.01 2.821e-
09 618-649


456 


PR00380


KINESIN HEAVY CHAIN
SIGNATURE


PR00380A 14.18 1.000e-
25 77-99 PR00380D
9.93 1.000e-21 281-303
PR00380C 13.18 8.286e-
17 230-249 PR00380B
12.64 4.724e-16 194-
212


457 


PR00253 


GAMMA-AMINOBUTYRIC ACID
(GABA) RECEPTOR
SIGNATURE


PR00253A 9.15 9.143e-
24 246-267 PR00253B
13.47 2.000e-23 272-
294 PR00253C 13.85
7.000e-23 306-328
PR00253D 16.68 5.950e-
21 452-473


467 


PR00849 


GLYCOSYL HYDROLASE
FAMILY 58 SIGNATURE


PR00849D 9.77 9.236e-
09 910-937


471 


BL00678


Trp-Asp (WD) repeat
proteins proteins.


BL00678 9.67 8.200e-12
33-44


472


BL00226 


Intermediate filaments 
proteins.


BL00226B 23.86 3.721e-
09 282-330


473 


BL00344 


GATA-type zinc finger 
domain proteins.


BL00344 17.99 7.000e-
12 814-852


474 


BL00481


Thiol-activated
cytolysins proteins.


BL00481E 13.07 8.909e-
09 173-199


479 


PR00319 


BETA G-PROTEIN 
(TRANSDUCIN) SIGNATURE 


PR00319B 11.47 2.571e-
09 393-409


480 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 1.900e- 
38 8-47 


481 


PR00405 


HIV REV INTERACTING 
PROTEIN SIGNATURE 


PR00405C 19.41 1.000e-
19 451-473 PR00405B
11.83 4.333e-18 430-
448 PR00405A 17.71
4.971e-18 411-431


482 


PR00049


WILM'S TUMOUR PROTEIN
SIGNATURE


PR00049D 0.00 9.286e-
10 959-974 PR00049D
0.00 9.857e-10 958-973
PR00049D 0.00 1.305e-
09 937-952 PR00049D
0.00 8.322e-09 939-954


486 


PR00007 


COMPLEMENT C1Q DOMAIN
SIGNATURE 


PR00007B 14.16 8.615e-
23 653-673 PR00007A
19.33 6.192e-22 626-
653 PR00007C 15.60
3 , ttabe-ia t>yo~ /^ju 
PR00007D 9.64 3.647e-
13 732-743


487 


PD00567 


PROTEIN RNA-BINDING RNA
REPEAT HYD.


PD00567B 18.23 2.853e-
09 200-214


488 


PR00988 


URIDINE KINASE SIGNATURE


PR00988A 6.39 4.569e-
12 3-21


489 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL- 
BINDING NU. 


PD01066 19.43 4.882e- 
27 30-69 PD01066 
19.43 3.430e-ld 71-110 


490 


PR00049 


WILM'S TUMOUR PROTEIN
SIGNATURE 


PR00049D 0.00 7.864e- 
09 663-678 


492 


BL01128 


Shikimate kinase 
proteins.


BL01128A 18.84 6.464e-
17 58-92


497 


PF00429 


ENV polyprotein (coat
polyprotein).


PF00429 31.08 7.871e-
15 21-71


498 


BL00120


Lipases, serine
proteins.


BL00120B 11.37 7.923e-
09 185-200


500 


BL00030 


Eukaryotic RNA-binding
region RNP-1 proteins.


BL00030A 14.39 7.353e-
11 299-318


501 


BL01159


WW/rsp5/WWP domain
proteins.


BL01159 13.85 8.579e-
12 131-146


505 


BL00021


Kringle domain proteins. 


BL00021B 13.33 3.739e- 
17 492-510 


508 


PR00120 


H+-TRANSPORTING ATPASE
(PROTON PUMP) SIGNATURE 


PR00120C 9.90 S.aOOe- 
19 705-722 


509 


DM01417


6 kw INDUCING XPMC2
MUSHROOM SPAC22G7.04.


DM01417E 20.62 2.938e-
16 362-395 DM01417D
11.08 3.800e-13 322-
338


510


PF00534


Glycosyl transferases
group 1.


PF00534B 14.47 6.625e-
09 346-370


511 


PF00534


Glycosyl transferases
group 1.


PF00534B 14.47 6.625e-
09 293-317


512 


PF00534 


Glycosyl transferases 
group 1.


PF00534B 14.47 6.625e-
09 366-390


513 


PD01841 


PHOSPHORYLASE KINASE
ALPHA MUSCL.


PD01841A 21.71 1.000e-
40 110-160 PD01841B
14.35 1.000e-40 181-
222 PD01841D 17.87
1.000e-40 243-295
PD01841F 13.36 1.000e-
40 333-382 PD01841G
24.26 1.000e-40 386-
440 PD01841L 18.42
1.000e-40 968-1010
PD01841I 23.00 4.545e-
37 762-804 PD01841E
18.60 3.750e-36 295-
333 PD01841J 14.94
6.023e-35 851-888
PD01841H 21.30 2.909e-
33 490-527 PD01841K
14.81 7.088e-33 924-
954 PD01841C 13.78
9.386e-23 222-243
PD01841M 10.82 8.594e-
21 1054-1073 PD01841I
23.00 2.667e-13 549-
591


514 


PR00153 


CYCLOPHILIN PEPTIDYL-
PROLYL CIS-TRANS
ISOMERASE SIGNATURE


PR00153C 11.01 7.188e-
13 95-111 PR00153E
9.10 4.150e-12 122-138


515


BL00740


MAM domain proteins. 


BL00740A 13.87 7.188e- 
12 410-423 


516 


DM00892 


3 RETROVIRAL PROTEINASE. 


DM00892C 23.55 6.087e- 
12 1018-1052 


517 


BL00242 


Integrins alpha chain 
proteins.


BL00242C 16.86 8.320e- 
09 12-42 


523 


DM00031


IMMUNOGLOBULIN V REGION.


DM00031A 16.80 3.750e-
39 20-68 DM00031B
15.41 1.000e-25 84-118


525 


BL00319


Amyloidogenic
glycoprotein
extracellular domain
proteins.


BL00319C 17.12 8.375e-
10 61-95


526 


PF00789 


Domain present in
ubiquitin-regulatory
proteins.


PF00789B 19.70 3.308e-
12 322-343 PF00789C
20.98 5.269e-09 367-
392


528 


BL01162 


Quinone oxidoreductase / 
zeta-crystallin 
proteins.


BL01162C 22.80 1.500e-
16 120-164


529 


PR00910 


LUTEOVIRUS ORF6 PROTEIN
SIGNATURE


PR00910A 2.51 3.893e-
09 60-73


532 


BL00215


Mitochondrial energy
transfer proteins.


BL00215A 15.82 4.000e-
17 11-36 BL00215A
15.82 8.660e-11 123-
148


533 


BL00215


Mitochondrial energy
transfer proteins.


BL00215A 15.82 4.000e-
17 11-36 BL00215A
15.82 8.660e-11 97-122


534 


BL00098


Thiolases acyl-enzyme
intermediate proteins.


BL00098C 21.65 2.800e-
38 181-227 BL00098B
32.59 5.345e-38 86-141
BL00098D 26.30 8.364e- 

z*tD-*^oo x5JbQ0098E 
22.12 1.000e-34 314-
352 BL00098F 10.18 
4.971e-22 365-386 
BL00098A 10.60 6.455e- 
11 38-50 


535 


PR00370 


FLAVIN-CONTAINING
MONOOXYGENASE (FMO) 
SIGNATURE 


PR00370E 11.96 7.429e- 
22 321-340 PR00370D 
16.33 6.143e-21 185- 
204 PR00370F 17.75 
6.559e-21 376-396 
PR00370B 10.91 9.591e-
21 27-46 PR00370C
12.72 3.500e-20 140-
157 PR00370A 3.35
6.442e-17 4-20


536


BL00028


Zinc finger, C2H2 type,
domain proteins.


BL00028 16.07 7.429e-
16 285-302 BL00028

358 BL00028 16.07
1.346e-11 369-386
BL00028 16.07 1.692e-
11 397-414 BL00028
16.07 4.452e-11 453-
470 BL00028 16.07
7.231e-11 425-442
BL00028 16.07 4.300e-
10 313-330


537 


BL00762 


WHEP-TRS domain 
proteins.


BL00762A 23.43 9.419e-
15 844-881


538 


BL00762 


WHEP-TRS domain 
proteins.


BL00762A 23.43 9.419e-
15 819-856


539 


BL00762


WHEP-TRS domain
proteins.


BL00762A 23.43 9.419e- 
15 822-859 


540 


PR00985


LEUCYL-TRNA SYNTHETASE
SIGNATURE


PR00985A 12.10 9.000e-
10 357-375


541 


PD02102


SUBUNIT E V-ATPASE 
VACUOLAR ATP SYNTHASE 
HYDROL. 


PD02102A 16.74 1.000e-
40 3-47 PD02102B
18.28 4.375e-34 57-100
PD02102D 21.69 1.923e-
30 179-218 PD02102C
26.34 8.929e-26 100-
146


543 


BL00028 


Zinc finger, C2H2 type, 
domain proteins.


BL00028 16.07 1.000e-
10 48-65 BL00028
16.07 6.400e-10 193- 
210 BL00028 16.07 
l.OOae-09 343-360 
BL00028 16.07 6.914e-
09 78-95


545 


BL00250 


TGF-beta family 
proteins. 


BL00250A 21.24 8.000e-
31 293-329 BL00250B
27.37 5.286e-24 354-
390


547 


PR00319 


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE


PR00319B 11.47 2.714e-
09 106-201 PR00319A
15.27 7.344e-09 210-
227


548 


BL01204 


NF-kappa-B/Rel/dorsal
domain proteins.


BL01204A 17.74 1.000e-
40 8-56 BL01204D
16.42 1.000e-40 177-
221 BL01204E 13.83
7.652e-30 225-250
BL01204C 13.93 8.714e-
22 141-160 BL01204B
15.41 4.333e-16 102-
116


549


PR00326


GTP1/OBG GTP-BINDING
PROTEIN FAMILY SIGNATURE


PR00326A 8.75 8.364e-
15 255-276


551 


PF00632 


HECT-domain (ubiquitin-
transferase).


PF00632C 20.66 3.302e- 
23 1569-1601 PF00632B 
18.45 3.700e-21 1515- 
1543 


554 


BL00290 


Immunoglobulins and
major histocompatibility
complex proteins.


BL00290B 13.17 1.600e-
14 187-205 BL00290A
20.89 2.059e-14 130-
153


557 


DM00215 


PROLINE-RICH PROTEIN 3.


ut'tW^XS X ? . O Di^J^CS — 

09 846-879 


559 


DM01111 


4 kw PHOSPHATASE 
TRANSFORMING 61K PDF1.


DM01111L 11.93 3.762e-
09 7-35


562 


PF00658


Poly-adenylate binding
protein, unique domain
proteins.


32 118-155 


564 


BL00141 


Eukaryotic and viral 
aspartyl proteases
proteins.


BL00141A 12.10 4.150e-


566 


PF00855


PWWP domain proteins.


PF00855 13.75 5.667e-
15 272-289


567 


PD01066


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 4.977e- 
13 229-268 


569 


BL00107 


Protein kinases ATP-
binding region proteins.


BL00107A 18.39 7.000e-
19 118-149 BL00107B 

199 


570 


BL00107 


Protein kinases ATP- 
binding region proteins.


BL00107A 18.39 7.000e-
19 118-149 BL00107B
13.31 5.500e-15 183-
199


572 


PR00193 


MYOSIN HEAVY CHAIN 
SIGNATURE 


PR00193D 14.36 1.857e-
34 454-483 PR00193C
12.60 2.636e-31 223-
251 PR00193B 11.69
7.750e-29 171-197
PR00193A 15.41 2.588e-
22 115-135 PR00193E
19.47 6.559e-19 508-
537


573 


PR00193 


MYOSIN HEAVY CHAIN 
SIGNATURE 


PR00193D 14.36 1.857e-
34 470-499 PR00193C
12.60 2.636e-31 239-
267 PR00193B 11.69
7.750e-29 171-197
PR00193A 15.41 2.588e-
22 115-135 PR00193E
19.47 6.559e-19 524-
553


575 


BL00752


XPA protein.


BL00752B 19.17 5.703e-
10 885-929


576 


BL00030


Eukaryotic RNA-binding
region RNP-1 proteins.


BL00030A 14.39 7.000e- 
09 276-295 


577 


BL00116


DNA polymerase family B


BL00116A 12,81 5.737e- 




578 




proteins. 


13 tib^-aVV BL00116B ' 

11.82 1.529e-12 952-
965


579 


BL00195




BL00195B 15.31 7.158e-
09 121-141 


580 


PR00019 


LEUCINE-RICH REPEAT
SIGNATURE


PR00019B 11.36 9.000e-
11 217-231 PR00019B
11.36 1.360e-09 386-
400 PR00019A 11.19
3.333e-09 389-403
PR00019B 11.36 8.920e-
09 363-377


583 


PR00253 



GAMMA-AMINOBUTYRIC ACID
(GABA) RECEPTOR 
SIGNATURE 


PR00253A 9.15 2.125e-
25 275-296 PR00253B
13.47 7.923e-24 301-
323 PR00253D 16.68
5.846e-23 444-465
PR00253C 13.85 2.241e-
20 335-357


584 






SELECTIN SUPERFAMILY
COMPLEMENT-BINDING 
REPEAT SIGNATURE 


PR00343C 16.85 2.286e- 
11 1233-1252 PR00343C 
16.85 5.500e-11 333-
352 PR00343C 16.85
5.500e-11 783-802
PR00343C 16.85 4.246e- 
10 1491-1510 PR00343C 
16.85 8.230e-10 1686- 
1705 


586 


DM01537 


SKI2W SKI2 NUCLEOLAR
HELICASE. 


DM01537B 21.63 1.878e- 
37 79-126 DM01537B 
21.53 9.491e-30 916- 
963 DM01537A 15.14 
3.136e-11 784-804


587 


PF00013


KH domain proteins
family of RNA binding
proteins.


PF00013 5.78 1.4S0e-O5 
124-136 


589 


DM00892 


3 RETROVIRAL PROTEINASE.


DM00892C 23.55 4.409e-
13 262-296 


590 


BL00478


LIM domain proteins.


BL00478B 14.79 1.643e-
13 261-276 BL00478B
14.79 7.709e-09 321- 
336 


591 


PF00855 


PWWP domain proteins.


PF00855 13.75 5.000e-
15 931-948


593 


PF00855


PWWP domain proteins.


PF00855 13.75 5.000e-
15 1062-1079


594 


PF00628


PHD-finger.


PF00628 15.84 3.455e-
12 424-439 


596


PR00205


CADHERIN SIGNATURE


PR00205B 11.39 2.241e-
16 558-576 PR00205A
14.73 9.308e-13 542- 
558 PR00205C 13.65 
5.304e-12 594-609 
PR00205B 11.39 4.273e- 
10 336-354 


598


BL00107


Protein kinases ATP-
binding region proteins.


BL00107A 18.39 4.789e-
18 307-338 


600


PD01675


GLYCOPROTEIN MAJOR
ENVELOPE PROBABLE U3.


PD01675C 19.89 2.330e-
10 55-39 




BL00242


Integrins alpha chain
proteins.


BL00242E 9.03 9.591e-
27 985-1014 BL00242C
16.86 4.115e-26 286-
516 BL00242D 13.57
1.150e-25 357-382
BL00242B 8.13 7.353e-
12 189-199 BL00242D
13.57 3.455e-11 421-
446 BL00242A 13.80






WO 01/53312



PCT/US00/34263











5.000e-11 61-73
BL00242D 13.57 4.986e-
10 291-316


601 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE


PR00320A 16.74 5.610e-
09 198-213 


602 


PR00278


PANCREATIC HORMONE
SIGNATURE


PR00278A 12.43 4.569e-
10 331-348 


603


BL00479 


Phorbol esters /
diacylglycerol binding
domain proteins.


BL00479C 12.01 3.250e-
12 170-183 


604 


BL00315 


Dehydrins proteins. 


BL00315A 9.35 1.672e-
09 424-452 


605


BL00415


Synapsins proteins.


BL00415N 4.29 9.794e-
10 295-339 


606 


PR00926


MITOCHONDRIAL CARRIER 
PROTEIN SIGNATURE 


PR00926F 17.75 1.000e-
13 335-358 


608 


PF00855


PWWP domain proteins.


PF00855 13.75 5.167e-
15 265-282


609 


PF00855 


PWWP domain proteins. 


PF00855 13.75 5.167e-
15 211-228


612 


DM01206 


CORONAVIRUS NUCLEOCAPSID
PROTEIN. 


DM01206B 10.69 7.411e-
10 877-897 DM01206B
10.69 8.027e-10 861-
881 DM01206B 10.69
9.137e-10 873-893
DM01206B 10.69 1.456e-
09 859-879 DM01206B
10.69 1.797e-09 879-
899 DM01206B 10.69
4.076e-09 865-885
DM01206B 10.69 7.038e-
09 898-918 DM01206B
10.69 7.949e-09 871-
891 DM01206B 10.69
8.291e-09 767-787


615


PD02699 


PROTEIN DNA-BINDING 
BINDING DNA. 


PD02699A 8.91 2.023e- 
28 129-158 PD02699C 
24.84 1.000e-27 317-
364 PD02699B 18.28
1.000e-17 158-162


616 


PR00380


KINESIN HEAVY CHAIN
SIGNATURE


PR00380A 14.18 4.086e-
22 288-310 PR00380D
9.93 3.721e-17 486-508
PR00380B 12.64 2.241e-
16 410-428 PR00380C
13.18 2.976e-13 436-
455


617 


PR00380 


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380A 14.18 4.086e-
22 288-310 PR00380D
9.93 3.721e-17 486-508
PR00380B 12.64 2.241e-
16 410-428 PR00380C
13.18 2.976e-13 436-
455 


618 


DM01206 


CORONAVIRUS NUCLEOCAPSID
PROTEIN. 


I~>KT n 1'5P'<tH Irt ^ a c njiiwJ'"' 

L/nujL^\jx>a j.u*b!7 o.l43e- 
12 531-551 DM01206B 
10.69 2.603e-10 535- 
555 


621 


PR00700


PROTEIN TYROSINE
PHOSPHATASE SIGNATURE


PR00700B 16.80 3.160e-
21 561-532


622 


BL00239


Receptor tyrosine kinase
class II proteins.


BL00239F 28.15 3.222e-
10 647-692 BL00239C
18.75 e.304e-10 543-
566 


623 


PR00407 


EUKARYOTIC MOLYBDOPTERIN
DOMAIN SIGNATURE 


PR00407K 9.94 8.448e- 
09 326-339 


624 


BL00641 


Respiratory-chain NADH
dehydrogenase 75 Kd
subunit proteins.


BL00641C 21.10 1.000e-
40 157-202 BL00641E
24.37 1.000e-40 255-
308 BL00641F 33.12
1.000e-40 571-623
BL00641A 17.15 1.515e-
37 48 - QQ BL00641B
12.62 5.846e-34 113-
139 BL00641D 13.23
9.308e-29 216-240


627 


PR00103 


CAMP-DEPENDENT PROTEIN 
KINASE SIGNATURE 


PR00103E 17.80 2.500e-
18 367-380 PR00103B
13.39 2.080e-14 297-
312 PR00103A 9.59
2.957e-14 282-297
PR00103D 10.83 3.077e-
12 346-358 PR00103C
15.68 1.000e-11 334-
344 PR00103B 13.39
1.450e-11 175-190
PR00103A 9.59 1.720e- 


630 


PR00081 


GLUCOSE/RIBITOL
DEHYDROGENASE FAMILY
SIGNATURE


PR00081A 10.53 6.211e-
16 4-22 


631 


PF00651


BTB (also known as BR-
C/Ttk) domain proteins.


PF00651 15.00 8.500e-
14 37-50 


632 


DM01206 


CORONAVIRUS NUCLEOCAPSID 
PROTEIN. 


DM01206B 10.69 2.233e- 
10 1324-1344 DM01206B 
10.69 4.a22e-10 1276- 
1296 DM01206B 10.69
7.o5ae-lU 13 28-1348 
DM01206B 10.69 8.274e-

10.69 4.532e-09 1320-
1340 DM01206B 10.69 
7.266e-09 1326-1346 


635


BL00107


Protein kinases ATP-
binding region proteins.


BL00107A 18.39 7.000e-
23 145-176 BL00107B
13.31 2.636e-13 211-
227 


636 


BL00657 


Fork head domain 
proteins . 


BL00657A 19.39 1.545e-
30 101-143 BL00657B
22.27 7.750e-26 149- 
192 


637 


BL00107 


Protein kinases ATP- 
binding region proteins . 


BL00107B 13.31 1.000e-
10 607-623 


643 


BL00018 


EF-hand calcium-binding
domain proteins.


BL00018 7.41 4.913e-09
199-212 


647 


PF00628 


PHD-finger.


PF00628 15.84 2.350e-
13 385-400 PF00628 
15.84 3.455e-12 464- 
479 


648 


BL01129


Hypothetical
yabO/yceC/sfhB family
proteins.


BL01129E 13.25 4.000e-
25 332-357 BL01129C
25.56 8.200e-23 236-
279 BL01129B 12.51
6.118e-13 191-212 


649 


BL01228


Hypothetical cof family
proteins.


BL01228D 17.44 3.908e- 
10 455-480 


650 


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 6.684e-
13 771-814 


651 


BL50002 


Src homology 3 (SH3) 
domain proteins profile. 


BL50002A 14.19 1.750e-
12 1026-1045 


653 


PR00253 


GAMMA-AMINOBUTYRIC ACID
(GABA) RECEPTOR 
SIGNATURE 


PR00253A 9.15 4.000e- 
24 253-274 PR00253C 
13.85 8.e00e-24 313- 
335 PR00253B 13.47
3.143e-22 279-301 
PR00253D 16.68 7.652e- 
20 422-443 


654 


PD01719 


PRECURSOR GLYCOPROTEIN 
SIGNAL RE. 


PD01719A 12.89 4.452e- 
11 969-997 PD01719A 
12.89 3.961e-10 128- 
155 PD01719A 12.89 
7.395e-10 1276-1304 
PD01719A 12.89 1.222e-
09 1220-1248 


657 


BL00354 


HMG-I and HMG-Y DNA-
binding domain proteins
(A+T-hook).


BL00354C 6.61 8.397e- 
09 563-578 


658 


BL00354


HMG-I and HMG-Y DNA-
binding domain proteins
(A+T-hook).


BL00354C 6.61 8.397e-
09 580-595


659 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 2.174e-
13 539-572 DM00215
19.43 4.750e-12 549-

582 DM00215 19.43
9.824e-11 551-584
DM00215 19.43 2.329e-
10 548-581 DM00215
19.43 4.054e-10 550-

583 DM00215 19.43 

DM00215 19.43 7.107e-


660 


PR00688


XYLOSE ISOMERASE
SIGNATURE


PR00688I 13.78 9.518e-
09 224-236 


661


BL00027


'Homeobox' domain
proteins.


BL00027 26.43 5.950e-
23 249-292 


662 


PR00360 


C2 DOMAIN SIGNATURE 


PR00360B 13.61 7.158e-
10 596-610 


663 


PR00360


C2 DOMAIN SIGNATURE


PR00360B 13.61 7.158e-
10 596-610


664 


PR00360


C2 DOMAIN SIGNATURE


PR00360B 13.61 7.158e-
10 596-610 


666 


PR00819 


CBXX/CFQX SUPERFAMILY 
SIGNATURE 


PR00819B 10.83 e,900e- 

1 n T A A — '70 fl 


667 


BL50040


Elongation factor 1
gamma chain profile.


BL50040C 22.62 2.143e-
16 135-178 


668 


PR00019


LEUCINE-RICH REPEAT 
SIGNATURE 


09 139-153 PR00019A 
11.19 1.667e-09 94-106 
PR00019B 11.36 4.600e- 
09 163-177 


670 


BL00018


EF-hand calcium-binding
domain proteins.


BL00018 7.41 3.250e-10
681-694 BL00018 7.41
6.400e-10 717-730


672 


PD00131 


ATP-BINDING TRANSPORT 
TRANSMEMBR.


PD00131B 34.97 1.000e-
34 356-410 PD00131C 
19.59 1.346e-26 504- 
542 


673 


PR00667


RETINAL PIGMENT
EPITHELIUM-RETINAL GPCR
SIGNATURE


PR00667G 15.33 7.557e-
10 106-123 


674 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE


PR00320A 16.74 4.857e-
13 593-608 PR00320B
12.19 4.115e-12 635-
650 PR00320C 13.01
8.435e-11 717-732
PR00320C 13.01 2.800e-
10 635-650 PR00320C
13.01 6.400e-10 593-
608 PR00320B 12.19
3.250e-09 593-608


675


PR00320


G-PROTEIN BETA WD-40
REPEAT SIGNATURE


PR00320A 16.74 4.857e-
13 572-587 PR00320B
12.19 4.115e-12 614-



4 
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SEQ ID NO: 



676 



680 



681 



687



689 



691



692 
693



ACCESSION 
NO. 



PR00019



PF00642 



PR00308



BL00019



PR00700 



PR00049 



BL01024 



BL00027 



BL00211 



DESCRIPTION 



LEUCINE-RICH REPEAT
SIGNATURE



Zinc finger C-x8-C-x5-C-
x3-H type (and similar).



TYPE I ANTIFREEZE
PROTEIN SIGNATURE 



Actinin-type actin-
binding domain proteins.
PROTEIN TYROSINE
PHOSPHATASE SIGNATURE

WILM'S TUMOUR PROTEIN 

SIGNATURE 



Protein phosphatase 2A
regulatory subunit PR55
proteins.



RESULTS* 



629 PR00320C 13.01
8.435e-11 696-711
PR00320C 13.01 2.800e-
10 614-629 PR00320C
13.01 6.400e-10 572-
587 PR00320B 12.19
3.250e-09 572-587



PR00019A 11.19 9.667e-
09 249-263 



PF00642 11.59 3.700e-
16 225-236 PF00642 
11.59 7.900e-12 187- 
198 



PR00308C 3.83 8.754e-

10 286-296 

BL00019D 15.33 4.200e-
19 227-257 



PR00700D 12.47 4.000e- 
09 99-118 



PR00049D 0.00 Q.SOOe- 
10 538-553



'Homeobox' domain
proteins.



ABC transporters family 
proteins. 



BL01024A 10,2^ l.OOOe- 
40 22-69 BL01024B 
5.91 1.000e-40 86-127
BL01024C 7.80 1.000e-
40 146-185 BL01024D
13.22 1.000e-40 185-
222 BL01024E 11.95
1.000e-40 222-266
BL01024F 9.42 1.000e-
40 266-317 BL01024G
11.09 1.000e-40 317-
349 BL01024H 13.88
1.000e-40 389-442



BL00027 26.43 8.071e-
31 152-195



BL00211A 12.23 5.050e-
09 45-57 



694 

696



697 



698 



700 
701 



BL00211



BL00741
DM01930 



PR00869 



ABC transporters family 
proteins. 

ABC transporters family 
proteins. 



BL00211A 12.23 5.050e-
09 45-57 



BL00680 Methionine

aminopeptidase subfamily 
1 proteins 



BL00211A 12.23 5.050e-
09 58-70 



BL00680 14.37 5.304e-
17 173-195 



Guanine-nucleotide
dissociation stimulators
CDC24 family sign.



BL00741B 14.27 3.418e-
11 242-265 



2 kw FINGER SMCX SMCY 
YDR096W. 



DNA-POLYMERASE FAMILY X
SIGNATURE 



DM01930B 15.41 1.367e-
37 170-215 DM01930F
14.16 8.232e-28 267-
303 DM01930B 19.86
9.163e-10 37-71



PR00869A 12.80 1.281e-
16 245-263 



702 



BL00523



C2H2-TYPE ZINC FINGER
SIGNATURE



Sulfatases proteins.



PR00048A 10.52 2.174e-
10 77-91 PR00048A
10.52 6.870e-10 133-
147 PR00048A 10.52
8.826e-10 105-119
PR00048A 10.52 5.320e-
09 161-175



BL00523E 19.27 2.565e-
25 326-356 BL00523A
13.36 5.050e-16 38-55
BL00523B 8.64 5.909e-
15 86-98 BL00523C
12.64 5.500e-13 137-
148 BL00523D 9.89
1.844e-11 290-302
BL00523G 9.46 5.500e-
10 513-523 BL00523F
10.85 6.351e-09 413-
424


703 


PR00048 


C2H2-TYPE ZINC FINGER
SIGNATURE


PR00048A 10.52 8.412e-
12 376-390 PR00048B
6.02 1.000e-10 334-344
PR00048B 6.02 1.474e-
09 364-374 


707 


PD00787 


SYNTHASE BIOSYNTHESIS 
TRANSFERASE. 


PD00787A 14.84 6.941e-
14 66-82 


708 


PR00761 


BINDIN PRECURSOR
SIGNATURE


PR00761E 14.32 8.500e-
10 822-841 


712 


DM01354


kw TRANSCRIPTASE REVERSE
II ORF2.


DM01354Y 10.69 4.977e-
38 425-465 DM01354X
13.86 7.300e-34 376-
415 DM01354V 12.97
4.923e-17 311-358
DM01354W 12.64 5.596e-
10 356-376 


713 


BL00039 


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039D 21.67 7.545e-
27 450-496 BL00039A
18.44 2.537e-18 147-
186 BL00039C 15.63
2.216e-14 280-304
BL00039B 19.19 1.947e-


715 


BL00383


Tyrosine specific
protein phosphatases
proteins.


BL00383E 10.35 4.981e-
10 150-161 


717 


PF00777 


Sialyl transferase 
family. 


PF00777C 18.60 4.035e-
21 106-161 


718 


DM00031 


IMMUNOGLOBULIN V REGION. 


DM00031A 16.80 3.750e-

39 20-68 DM00031B 
15.41 2.688e-28 84-118 
DM00031C 12.79 1.300e-
12 131-142 


719 


BL00243


Integrins beta chain
cysteine-rich domain
proteins.


BL00243B 17.54 1.000e-
40 131-172 BL00243C
16.42 1.000e-40 172-
208 BL00243D 24.07
1.000e-40 222-274
BL00243F 22.63 1.000e-
40 314-358 BL00243I
31.77 6.571e-39 607-
650 BL00243E 16.70
3.077e-35 274-304
BL00243G 21.38 3.625e-
34 358-400 BL00243H
17.53 5.235e-29 567-
593 BL00243A 17.61
3.250e-21 63-84
BL00243H 17.53 7.167e-
16 477-503 BL00243H
17.53 2.304e-11 524-
550 BL00243H 17.53
5.304e-11 606-632
BL00243I 31.77 1.380e-
09 610-653


720 




43 KD POSTSYNAPTIC 
PROTEIN SIGNATURE


PR00217C 10.91 8.022e-
09 20-36 


722 


PR00704


CALPAIN CYSTEINE
PROTEASE (C2) FAMILY
SIGNATURE


PR00704D 11.05 5.909e-
34 135-161 PR00704F
13.61 7.000e-26 190-
218 PR00704E 12.55
8.071e-26 165-189
PR00704B 17.94 2.241e-
23 75-98 PR00704A
14.68 4.094e-19 30-54
PR00704C 11.88 1.871e-
18 99-116


725 


PR00194 


TROPOMYOSIN SIGNATURE 


PR00194A 7.86 7.652e-
09 169-187 


726 


PR00194 


TROPOMYOSIN SIGNATURE


PR00194A 7.86 7.652e-
09 169-187 


727 
731 


PR00320


G-PROTEIN BETA WD-40
REPEAT SIGNATURE


PR00320C 13.01 2.125e-
13 277-292 PR00320A
16.74 1.310e-11 277-
292 PR00320C 13.01
4.522e-11 323-338
PR00320A 16.74 6.586e- 
11 323-338 PR00320B 
12.19 4.343e-10 323- 
338 PR00320B 12.19 
6.914e-10 277-292 




PR00195


DYNAMIN SIGNATURE


PR00195A 11.94 8.627e-
16 288-307 PR00195E
8.82 3.912e-11 457-474


733 


PF00642


Zinc finger C-x8-C-x5-C-
x3-H type (and similar).


PF00642 11.59 9.082e-
10 787-798 


738 


BL00039


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039A 18.44 2.565e-
28 26-65 BL00039D
21.67 2.105e-20 338-
384 BL00039C 15.63
9.100e-13 160-184
BL00039B 19.19 9.617e-
11 73-99 


739 


BL01289 


TSC-22 / dip / bun 
family proteins. 


BL01289A 12.18 8.909e-
31 326-353 BL01289B
10.45 9.571e-17 353-
383


742 


BL01019 


ADP-ribosylation factors
family proteins.


BL01019A 13.20 7.075e-
12 41-81


743 
747


BL00965 


Phosphomannose isomerase 
type I proteins. 


BL00965C 23.78 1.000e-
40 256-305 BL00965B
17.77 1.600e-25 126-
153 BL00965A 10.57
6.400e-19 94-113




BL00021


Kringle domain proteins.


BL00021D 24.56 4.563e-
25 231-273 BL00021B
13.33 5.345e-21 60-78


748 


BL00612


Osteonectin domain
proteins.


BL00612B 11.35 2.034e-
11 93-126 


749 


PR00450


RECOVERIN FAMILY
SIGNATURE


PR00450C 12.22 6.880e-
10 135-157


752 


BL00795


Involucrin proteins.


BL00795C 17.06 6.000e-
11 384-429 BL00795C
17.06 9.444e-11 370-
415 


754 
755 


BL00051 


Ribosomal protein L39e
proteins.


BL00051 20.92 1.935e-
16 4-50




DM01970


0 kw ZK632.12 YDR313C
ENDOSOMAL III.


DM01970B 8.60 7.723e-
09 171-184 


760 


BL01020 


SAR1 family proteins.


BL01020C 15.35 9.020e-
12 99-150 


762 


BL00046


Histone H2A proteins.


BL00046 12.95 1.000e-
40 33-88 


763 


PD02411


PROTEIN TRANSCRIPTION
REGULATION NUCLEAR.


PD02411 21.89 9.137e-
10 206-240 


764


BL00027


'Homeobox' domain
proteins.


BL00027 26.43 8.800e-
29 417-460 


767


BL01208


VWFC domain proteins.


BL01208B 15.83 6.063e-
10 309-324 BL01208B
15.83 8.031e-10 165-
180 BL01208B 15.83
4.X62e-09 85-100


770




BL00031 


Nuclear hormones 
receptors DNA-binding
region proteins.


BL00031A 19.55 9.571e-
32 206-241 BL00031B
22.25 5.500e-27 242-
274 


772 


PR00449


TRANSFORMING PROTEIN P21
RAS SIGNATURE


PR00449A 13.20 1.450e-
18 4-26 PR00449E
13.50 3.520e-14 142-
165 PR00449C 17.27
3.032e-13 44-67
PR00449D 10.79 8.579e-
13 107-121 PR00449B
14.34 3.455e-11 27-44


773 




Sulfatases proteins.


BL00523E 19.27 9.333e-
23 299-329 BL00523A
13.36 2.200e-13 47-64
BL00523B 8.64 2.607e-
13 91-103 BL00523D
9.89 7.923e-12 224-236
BL00523C 12.64 4.512e-
10 141-152 BL00523F
10.85 5.821e-10 373-
384 


775 


BL00028 


Zinc finger, C2H2 type, 
domain proteins.


BL00028 16.07 7.686e- 
09 568-585 


776 


BL00028 


Zinc finger, C2H2 type, 
domain proteins.


BL00028 16.07 7.686e- 
09 621-638 


777


BL00028


Zinc finger, C2H2 type,
domain proteins.


BL00028 16.07 7.686e-
09 595-612 


778 


BL00030 


Eukaryotic RNA-binding
region RNP-1 proteins.


BL00030A 14.39 8.412e-
11 322-341 BL00030A
14.39 7.000e-10 220- 
239 


779 
781 


PR00079


GLUCOSE-6-PHOSPHATE
DEHYDROGENASE SIGNATURE


PR00079B 12.98 2.929e-
26 193-222 PR00079E
16.65 4.150e-23 348-
375 PR00079C 8.68
6.351e-16 246-264
PR00079D 13.51 7.070e-
16 264-281 PR00079A
16.12 6.769e-13 169-
183




BL00215


Mitochondrial energy
transfer proteins.


BL00215A 15.82 9.250e-
17 10-35 BL00215A
15.82 6.000e-16 221-
246 BL00215A 15.82
7.857e-12 108-133
BL00215B 10.44 9.526e-
11 168-181 


783 


PD00239 


PROTEIN SH2 DOMAIN
REPEAT PRESYNA. 


PD00289 9.97 6.276e-09 
159-173 


785 
786 


BL00690


DEAH-box subfamily ATP-
dependent helicases
proteins.


BL00690B 13.38 1.000e-
12 147-165 BL00690A
6.87 5.320e-10 114-124
BL00690C 7.51 3.189e- 
09 218-228 




PR00449 


TRANSFORMING PROTEIN P21 

RAS SIGNATURE 


PR00449C 17.27 8.500e-
16 50-73 PR00449A
13.20 5.235e-14 8-30
PR00449E 13.50 2.853e-
11 150-173 PR00449D
10.79 1.545e-09 111-
125 


788


DM01206


CORONAVIRUS NUCLEOCAPSID
PROTEIN.


DM01206B 10.69 8.767e-

10 1-21 


790


BL00915


Phosphatidylinositol 3-
and 4-kinases proteins


BL00915C 22.43 9.182e-
39 725-764 BL00915B
22.78 5.050e-33 633-
671 BL00915D 27.02
1.529e-21 795-831
BL00915A 10.09 1.000e-
13 395-407


791 


PR00208


GLIADIN AND LMW GLUTENIN
SUPERFAMILY SIGNATURE


PR00208A 12.59 6.294e-
10 120-138 PR00208A
12.59 6.294e-10 121-
139 PR00208A 12.59
6.294e-10 122-140
PR00208A 12.59 6.294e-
10 123-141 PR00208A
12.59 6.294e-10 124-
142 PR00208A 12.59

PR00208A 12.59 6.294e-
10 126-144 PR00208A
12.59 6.294e-10 127-
145 PR00208A 12.59
6.294e-10 128-146
PR00208A 12.59 6.294e-
10 129-147 PR00208A
12.59 7.411e-09 130-
148 PR00208A 12.59
7.658e-09 131-149
PR00208A 12.59 7.904e-
09 132-150 PR00208A
12.59 8.274e-09 118-
136 PR00208A 12.59
o.ii/^e-uy ny-j.j/ 


795 


PR00205 


CADHERIN SIGNATURE


PR00205B 11.39 5.034e-

14.73 1.257e-11 284-
300 PR00205C 13.65
1.333e-11 337-352


796 


BL00412 


Neuromodulin (GAP-43)
proteins.


BL00412D 16.54 4.000e-
12 196-247 BL00412D
16.54 5.705e-11 197-
248 BL00412D 16.54
7.848e-10 199-250
BL00412D 16.54 1.827e-
09 195-246 BL00412D
16.54 1.918e-09 194-
245 BL00412D 16.54
2.102e-09 201-252


797 


BL00021


Kringle domain proteins.


BL00021B 13.33 6.339e-
13 40-58 


799 


BL01052


Calponin family repeat
proteins.


BL01052C 18.51 1.000e-
40 87-127 BL01052A
16.12 1.529e-32 3-35
BL01052B 15.31 1.257e-
25 52-78 BL01052D
10.26 5.737e-25 174-
194 


800 


BL00348 


p53 tumor antigen 
proteins. 


BL00348F 23.19 3.714e-
09 197-240 


801


BL00309


Vertebrate galactoside-
binding lectin proteins.


BL00309G 18.65 1.621e-
09 62-87 


802 


PR00245


OLFACTORY RECEPTOR
SIGNATURE 


PR00245D 10.47 5.224e- 
09 187-199 


804 


PF00774


Dihydropyridine
sensitive L-type calcium
channel (Beta subuni.


PF00774A 16.47 e.457e-
10 110-156 


808 


PR00667


RETINAL PIGMENT 
EPITHELIUM-RETINAL GPCR 
SIGNATURE 


PR00667C 11.71 9.875e- 
09 12-28 


810 


PD02346 


PHOTOSYSTEM II PROTEIN
PRECURSOR
PHOTOSYNTHESIS.


PD02346F 12.89 4.340e-
09 317-354




811 


BL00685


CBF-A/NF-YB subunit
proteins.


BL00685B 14.41 6.779e-
14 54-95 BL00685A
11.22 4.798e-13 5-54


812 


PR00080


alcohol dehydrogenase
superfamily signature


PR00080A 9.32 9.419e-
10 93-105 


813 


BL00357


Histone H2B proteins.


BL00357 7.74 1.988e-17 
22-65 


815 


PD00066 


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 7.923e-
15 158-171 PD00066
13.92 5.200e-14 46-59
PD00066 13.92 7.000e-
14 18-31 PD00066
13.92 7.000e-13 130-
143 PD00066 13.92
7.500e-13 214-227
PD00066 13.92 9.000e-
13 102-115 PD00066
13.92 4.429e-12 186-
199 PD00066 13.92
1.783e-11 74-87


816 


BL01195 


Peptidyl-tRNA hydrolase
proteins.


BL01195C 20.12 3.348e- 
20 100-139 


820 


BL00520


Interleukin-10 family
proteins.


BL00520A 61.21 6.471e-
09 1-14 


822 


BL00972 


Ubiquitin carboxyl-
terminal hydrolases
family 2 proteins.


BL00972A 11.93 8.113e- 
09 224-242 


825 


PR00876 


NEMATODE METALLOTHIONEIN
SIGNATURE


PR00876B 7.66 2.268e-
10 101-115 


829 


PD02855 


FLAVOPROTEIN PROTEIN
DNA/PANTOTHEN.


PD02855A 18.37 4.732e-
28 88-124 PD02855B
8.36 6.478e-09 132-142


830 


PR00405 


HIV REV INTERACTING
PROTEIN SIGNATURE


PR00405B 11.83 7.000e-
21 44-62 PR00405C
19.41 1.000e-13 65-87
PR00405A 17.71 7.283e-
13 25-45 


831 


PR00019 


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019A 11.19 1.000e-
09 47-61 PR00019B 
11.36 1.720e-09 136- 
150 PR00019B 11.36 
3.880e-09 44-58 


832 


PR00011


TYPE III EGF-LIKE
SIGNATURE


PR00011B 13.08 3.438e-
16 164-183 PR00011D
14.03 6.850e-16 164-
183 PR00011A 14.06
8.364e-14 164-183
PR00011C 24.25 5.415e-
12 231-260 PR00011D
14.03 9.852e-11 212-
231 


834 


PD00306 


PROTEIN GLYCOPROTEIN 
PRECURSOR RE. 


PD00306A 10.26 7.000e-
12 232-246 


835 


PD00306 


PROTEIN GLYCOPROTEIN
PRECURSOR RE.


PD00306A 10.26 4.000e-
10 290-304 


836 


PD00306 


PROTEIN GLYCOPROTEIN 
PRECURSOR RE. 


PD00306A 10.26 7.000e- 
12 216-230 


837 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 3.898e-
09 78-111 


839 


PD02784 


PROTEIN NUCLEAR 
RIBONUCLEOPROTEIN . 


PD02784B 26.46 8.302e-
09 73-116 


840 


PR00700


PROTEIN TYROSINE 
PHOSPHATASE SIGNATURE 


PR00700B 16.80 5.091e-
22 369-390 PR00700D
12.47 5.765e-21 491-
510 PR00700C 13.17
4.750e-14 445-467
PR00700P 11.18 8.300e-
11 538-549 PR00700E
17.57 3.100e-10 522-
538


841 


PR00109


TYROSINE KINASE
CATALYTIC DOMAIN
SIGNATURE


PR00109B 12.27 5.404e-
13 134-153 


844


PD02785 


PROTEIN RIBOSOMAL 60S 
L22 RNA-BINDING HEP. 


PD02785B 14.43 1.000e-
40 58-112 PD02785A
15.23 1.915e-28 8-57


845 


BL00826


MARCKS family proteins. 


BL00826C 7.63 6.738e- 
09 203-230 


846 


BL00518


Zinc finger, C3HC4 type
(RING finger), proteins.


BL00518 12.23 4.429e-
10 15-24 


849 


BL00518 


Zinc finger, C3HC4 type 
(RING finger), proteins. 


BL00518 12.23 1.000e-
08 340-349 


850 


PR00308 


TYPE I ANTIFREEZE
PROTEIN SIGNATURE


PR00308A 5.90 6.506e-
09 12-27 


851 


PD02411 


PROTEIN TRANSCRIPTION 
REGULATION NUCLEAR. 


PD02411 21.89 7.000e-
16 246-280 


852 


BL00420 


Speract receptor repeat 
proteins domain 
proteins . 


BL00420B 22.67 1.000e-
40 723-778 BL00420B
22.67 1.321e-38 933-
988 BL00420B 22.67
8.457e-28 482-537
BL00420B 22.67 4.500e-
27 587-642 BL00420B
22.67 9.625e-27 270-
325 BL00420B 22.67
4.205e-26 163-218
BL00420B 22.67 5.731e-
23 55-110 BL00420B
22.67 6.464e-20 377-
432 BL00420B 22.67
2.800e-15 830-885
BL00420C 11.90 1.900e-
13 355-366 BL00420C
11.90 1.900e-12 808-
819 BL00420C 11.90
3.550e-12 248-259
BL00420C 11.90 2.831e-
11 141-152 BL00420C
11.90 5.119e-11 1018-
1029 BL00420C 11.90
7.955e-10 567-578


853 


BL00420 


Speract receptor repeat 
proteins domain 
proteins . 


BL00420B 22.67 1.000e-
40 756-811 BL00420B
22.67 1.321e-38 966-
1021 BL00420B 22.67
8.457e-28 482-537
BL00420B 22.67 4.500e-
27 620-675 BL00420B
22.67 9.625e-27 270-
325 BL00420B 22.67
4.205e-26 163-218
BL00420B 22.67 5.731e-
23 55-110 BL00420B
22.67 6.464e-20 377-
432 BL00420B 22.67
2.800e-15 863-918
BL00420C 11.90 1.900e-
13 355-366 BL00420C
11.90 1.900e-12 841-
852 BL00420C 11.90
3.550e-12 248-259
BL00420C 11.90 2.831e-
11 141-152 BL00420C
11.90 5.119e-11 1051-
1062 BL00420C 11.90
7.955e-10 567-578


857 


PR00388


3',5'-CYCLIC NUCLEOTIDE
CLASS II
PHOSPHODIESTERASE
SIGNATURE


PR00388A 10.45 2.778e-
09 64-83 


859 


BL00030 


Eukaryotic RNA-binding
region RNP-1 proteins.


BL00030A 14.39 2.929e-
13 37-56 BL00030B
7.03 1.900e-11 167-177
BL00030A 14.39 2.000e-
10 128-147 


861 


PR00988 


URIDINE KINASE SIGNATURE 


PR00988A 6.39 4.250e-
17 23-41 PR00988C
13.64 8.714e-16 107-
123 PR00988F 12.23
7.828e-15 198-212
PR00988E 8.27 9.769e-
12 176-188 PR00988D
5.95 8.250e-11 163-174
PR00988B 11.60 4.512e- 
10 60-72 


863 


BL00215 


Mitochondrial energy 
transfer proteins.


BL00215B 10.44 8.071e-
12 41-54


864 


PR00775


90 KD HEAT SHOCK PROTEIN
SIGNATURE


PR00775E 8.06 1.000e-
24 198-221 PR00775B
3.52 1.837e-23 107-130
PR00775D 8.91 4.484e-
17 171-189 PR00775A
9.90 8.342e-17 86-107
PR00775C 10.68 9.379e-
17 153-171 PR00775G
10.64 6.850e-15 267-
286 PR00775F 12.76
6.769e-14 249-267


866 


DM01688 


2 POLY-IG RECEPTOR. 


DM01688G 16.45 9.460e- 
09 89-121 


867 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 5.596e-
29 14-53 


868 


BL01287 


RNA 3'-terminal
phosphate cyclase 
proteins . 


BL01287A 17.95 2.688e- 
26 16-48 


869 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 6.464e-
10 304-337 


872 


BL00046


Histone H2A proteins.


BL00046 12.95 1.000e-
40 30-85 


874 


BL00188


Biotin-requiring enzymes
attachment site


BL00188 30.29 9.036e-
32 665-711 


876 


BL00028


Zinc finger, C2H2 type,
domain proteins.


BL00028 16.07 7.686e-
09 298-315 


877 


PD02102 


SUBUNIT E V-ATPASE 
VACUOLAR ATP SYNTHASE
HYDROL.


PD02102A 16.74 4.176e- 
10 97-141 


879 


BL01189 


Ribosomal protein S12e 
proteins. 


BL01189A 14.27 1.000e-
40 35-71 BL01189B
13.49 1.000e-40 71-125 


882 


BL00284 


Serpins proteins. 


BL00284C 28.56 6.400e-
25 62-104 BL00284B
17.99 6.182e-12 35-56


889 


BL00216 


Sugar transport 
proteins. 


BL00216B 27.64 4.375e- 
21 35-85 


896 


PR00391


PHOSPHATIDYLINOSITOL
TRANSFER PROTEIN
SIGNATURE


PR00391E 12.50 7.785e-
15 211-231 PR00391B
8.39 1.000e-13 83-104
PR00391D 12.21 9.328e-
13 191-207 PR00391A
7.83 5.390e-11 16-36


897 


PR00327


ICE NUCLEATION PROTEIN
SIGNATURE


PR00327C 6.37 5.247e-
09 313-328


898


BL00039


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039D 21.67 7.800e-
26 386-432 BL00039A
18.44 6.674e-16 113-
152 BL00039B 19.19
1.947e-13 153-179
BL00039C 15.63 9.460e-
11 236-260


901 


PD00066 


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 8.200e-
16 254-267 PD00066
13.92 8.200e-16 282-
295 PD00066 13.92
8.200e-16 310-323
PD00066 13.92 8.200e-
16 366-379 PD00066 
13.92 8.200e-16 394- 
407 PD00066 13.92 
8.200e-14 338-351 


902 


BL01115


GTP-binding nuclear
protein ran proteins.


BL01115A 10.22 9.321e-
11 6-50


903 


PR00806


VINCULIN SIGNATURE


PR00806B 4.26 9.160e-
09 97-111 


904 


PR00381 


KINESIN LIGHT CHAIN
SIGNATURE


PR00381E 8.75 6.586e-
25 335-356 PR00381B
18.17 2.667e-24 204-
224 PR00381A 9.55
2.800e-24 107-1 9*;
PR00381C 12.48 4.522e-
24 226-245 PR00381D
13.94 1.084e-22 291-
309 PR00381F 9.13
3.288e-22 370-392
PR00381F 9.13 7.181e-
13 286-308 PR00381B
8.73 4.066e-11 251-272
PR00381E 8.75 7.033e-
11 293-314 PR00381E
8.75 8.364e-10 377-398
PR00381D 13.94 5.230e-
09 333-351 PR00381C
12.48 7.120e-09 310-
329 


906 


PR00345 


STATHMIN FAMILY 
SIGNATURE 


PR00345C 4.54 8.557e-
09 525-549 


907 


PR00345


STATHMIN FAMILY
SIGNATURE


PR00345C 4.54 8.557e-
09 513-537 


908 


BL00678


Trp-Asp (WD) repeat
proteins proteins.


BL00678 9.67 9.308e-11
144-155 


910


PD01066


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 2.800e-
30 48-87 


912


BL01104


Ribosomal protein L13e
proteins.


BL01104C 15.14 6.000e-
09 364-392


922


BL00678


Trp-Asp (WD) repeat 
proteins proteins. 


BLO 0678 9 67 "i" ftAoo TK 5"""" 
500-511 


923 


PR00320


G-PROTEIN BETA WD-40
REPEAT SIGNATURE


PR00320C 13.01 2.500e-
09 323-338 PR00320C
13.01 5.500e-09 187-
202 


924 


PD02181


PROTOCHLOROPHYLLIDE
REDUCTASE PHOTOSYNT.


PD02181D 12.85 8.609e-
09 36-54


926


BL00019


Actinin-type actin-
binding domain proteins.


BL00019C 14.66 7.453e-
25 108-144 BL00019B
13.34 8.510e-11 61-84
BL00019D 15.33 9.338e-
11 205-235 BL00019A
12.56 2.373e-10 34-45


928


BL00678


Trp-Asp (WD) repeat
proteins proteins.


BL00678 9.67 9.308e-11
273-284 BL00678 9.67
1.600e-10 314-325
BL00678 9.67 7.600e-10
360-371 BL00678 9.67
8.579e-09 206-217


929 


BL00518 


Zinc finger, C3HC4 type 
(RING finger), proteins. 


BL00518 12.23 1.857e-
10 137-146 


930 


BL01085


Ribulose-phosphate 3-
epimerase family
proteins.


BL01085D 16.55 4.600e-
24 134-165 BL01085B
10.15 5.680e-22 30-52
BL01085E 18.87 8.676e-
20 172-202 BL01085C 
21.81 2.038e-14 66-97 


931 


BL01085 


Ribulose-phosphate 3-
epimerase family
proteins.


BL01085D 16.55 4.600e-
24 152-183 BL01085B
10.15 5.680e-22 30-52
BL01085E 18.87 8.676e-
20 190-220 BL01085C
21.81 2.038e-14 66-97 


933 


PD00301 


PROTEIN REPEAT MUSCLE
CALCIUM-BI.


PD00301A 10.24 6.400e-
09 160-171 


936


PF00168


C2 domain proteins.


PF00168C 27.49 4.000e-
12 336-362 


937 


BL00415 


Synapsins proteins.


BL00415N 4.29 9.519e-
10 8-49


940


PR00862


PROLYL OLIGOPEPTIDASE
SERINE PROTEASE (S9A)
SIGNATURE


PR00862D 16.17 4.056e-
09 63-84 


945 


BL01230 


RNA methyltransferase
trmA family proteins.


BL01230B 11.62 2.373e-
09 407-420 


948


BL00479 


Phorbol esters / 
diacylglycerol binding 
domain proteins . 


BL00479B 12.57 7.429e- 
18 52-68 BL00479A
19.86 2.200e-13 26-49 


949 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 1.474e-09 
100-111 


954 


PD01311 


PROTEIN OXIDOREDUCTASE 
NAD INTERGENIC RE.


PD01311A 30.23 5.909e-
10 66-111 


955 


PF00651 


BTB (also known as BR-
C/Ttk) domain proteins.


PF00651 15.00 3.250e-
12 47-60


956


PF00651


BTB (also known as BR-
C/Ttk) domain proteins.


PF00651 15.00 3.250e- 
12 47-60 


957 


BL00379 


CDP-alcohol
phosphatidyltransferases
proteins.


BL00379 24.54 1.^10e-
15 111-148


959 


BL01115 


GTP-binding nuclear
protein ran proteins. 


BL01115A 10.22 1.884e- 
10 31-75 


960 


BL01115 


GTP-binding nuclear
protein ran proteins.


BL01115A 10.22 3.438e-
14 110-154 


962 


BL00061 


Short-chain
dehydrogenases/reductases
family proteins.


BL00061B 25.79 6.586e-
13 198-236 


963 


PR00502


MUTT DOMAIN SIGNATURE 


PR00502A 15.06 8.200e- 
11 210-225 


966 


PR00308


TYPE I ANTIFREEZE
PROTEIN SIGNATURE


PR00308A 5.90 7.035e-
09 55-70 


967 


DM01206 


CORONAVIRUS NUCLEOCAPSID
PROTEIN.


DM01206B 10.69 1.286e-
12 104-124 DM01206B
10.69 5.299e-11 23-43
DM01206B 10.69 8.274e-
10 73-93 DM01206B
10.69 3.962e-09 108-
128 DM01206B 10.69
5.671e-09 38-58


969 


PF01008


Initiation factor 2
subunit.


PF01008B 25.59 4.724e-
31 417-460 PF01008C
12.25 5.333e-18 506-
526 PF01008A 20.14
5.875e-15 369-390


970 


BL01277 


Ribonuclease PH
proteins.


BL01277C 10.18 7.648e-
10 112-143 BL01277A
17.39 9.806e-10 40-78


975 


BL01159


WW/rsp5/WWP domain
proteins.


BL01159 13.85 3.605e-
12 130-145 BL01159
13.85 4.122e-10 171- 
186 


977 


PF00791 


Domain present in ZO-1 
and Unc5-like netrin
receptors . 


PF00791C 20.98 2.235e- 
09 55-94 


978 


BL01167 


Ribosomal protein L17
proteins . 


BL01167B 20.66 8.258e- 
19 88-127 


979 


BL00478 


LIM domain proteins. 


BL00478B 14.79 9.357e-
13 33-48 BL00478B
14.79 7.250e-12 98-113 


980 


PR00312 


CALSEQUESTRIN SIGNATURE 


PR00312E 8.32 3.423e-
36 169-199 PR00312I
15.78 5.286e-35 332-
361 PR00312F 16.06
5.865e-35 199-229
PR00312H 13.31 8.313e-
35 263-291 PR00312a
13.73 5.688e-34 363-
392 PR00312D 9.43
2.636e-33 128-158
PR00312C 15.14 8.639e-
33 92-122 PR00312B
15.08 8.941e-33 62-92
PR00312G 11.11 6.657e-
32 230-258 PR00312A
11.70 6.914e-27 35-59


981 


PF00992 


Troponin.


PF00992A 16.67 8.816e- 
09 414-449 


982 


PR00299


ALPHA CRYSTALLIN
SIGNATURE


PR00299F 13.20 2.367e- 
09 127-149 


983 


BL01150 


Respiratory-chain NADH
dehydrogenase 20 Kd
subunit proteins.


BL01150B 17.16 1.000e-
40 156-202 BL01150A 
14.10 8.200e-39 100- 
138 


986 


BL00795


Involucrin proteins.


BL00795C 17.06 7.211e-
14 4-49 BL00795C
17.06 1.778e-11 1-46
BL00795C 17.06 3.407e-
10 14-59 BL00795C
17.06 7.802e-10 2-47
BL00795C 17.06 8.640e-
10 19-64 BL00795C 
17.06 7.400e-09 11-56 
BL00795C 17.06 7.800e- 
09 3-48 


987 


BL00939


Ribosomal protein L1e
proteins. 


BL00939F 17.27 5.393e- 
09 810-840 


988 


PR00452


SH3 DOMAIN SIGNATURE


PR00452B 11.65 6.538e- 
11 525-541 


989 


PR00452 


SH3 DOMAIN SIGNATURE


PR00452B 11.65 6.538e- 
11 497-513 


994 


BL00027 


'Homeobox' domain
proteins. 


BL00027 26.43 2.500e- 
25 146-189 


997 


BL01304


ubiH/COQ6 monooxygenase
family proteins. 


BL01304A 8.05 3.893e- 
11 65-79 


998 


DM01767 


5 TRANSMITTER DOMAIN. 


DM01767B 10.07 7.868e- 
09 22-39 


1000 


PR00926 


MITOCHONDRIAL CARRIER
PROTEIN SIGNATURE


PR00926C 16.07 1.750e-
24 73-94 PR00926D
10.53 3.250e-23 126-
145 PR00926F 17.75
6.211e-23 217-240
PR00926E 11.70 6.625e-
20 174-193 PR00926B
16.07 2.125e-18 24-39
PR00926A 10.41 1.000e-
15 11-25 PR00926F
17.75 5.565e-09 120-
143


1005 


BL00406


Actins proteins.


BL00406B 5.47 1.000e-
40 88-143 BL00406C
6.75 1.000e-40 147-202
BL00406D 12.58 3.700e-
40 270-325 BL00406E
8.44 7.375e-38 327-377
BL00406A 9.95 3.348e-
29 11-46


1006 


BL00406 


Actins proteins. 


BL00406B 5.47 1.000e-
40 88-143 BL00406C
6.75 1.000e-40 147-202
BL00406E 8.44 1.000e-
35 248-298 BL00406A
9.95 3.348e-29 11-46


1007


PR00304


TAILLESS COMPLEX
POLYPEPTIDE 1
(CHAPERONE) SIGNATURE


PR00304D 11.04 8.714e- 
22 38-1-407 PR00304C 
8.69 4.667e-20 98-118 
PR00304B 11.60 7.577e-
19 68-87 PR00304A
9.20 3.382e-16 46-63
PR00304E 7.79 6.870e- 
13 418-431 


1009 


PD01066


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 2.929e-
32 9-48 


1011


PD01066


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 2.929e-
32 68-107 


1012 


BL00518


Zinc finger, C3HC4 type
(RING finger), proteins.


BL00518 12.23 6.143e-
10 64-73 


1016 


PD01168


SYNTHETASE LIGASE
PROTEIN ALANYL.


PD01168H 12.08 1.000e-
11 174-194 


1018 


PD00930 


PROTEIN GTPASE DOMAIN 

ACTIVATION.


PD00930B 33.72 1.391e- 
32 261-302 PD00930A 
25.62 9.550e-22 157-
183 


1022 


BL00175


Phosphoglycerate mutase
family phosphohistidine
proteins.


BL00175A 15.42 5.179e-
12 6-26 BL00175C
23.75 8.062e-10 79-111


1025 


PR00305


14-3-3 PROTEIN ZETA
SIGNATURE


PR00305D 16.34 1.439e- 
10 158-185 


1026 


BL00353


HMGl/2 proteins. 


BL00353B 11.47 2.436e- 
0.0 ^^o—JotS iJLi0O3 53C 
14.83 8.844e-11 288-
335 


1028 


BL00183


Ubiquitin-conjugating
enzymes proteins.


BL00183 28.97 1.310e-
33 43-91 


1033 


PF00580


UvrD/REP helicase.


PF00580A 13.37 4.720e-
09 111-133 


1034 


PR00413 


HALOACID 

DEHALOGENASE/EPOXIDE 
HYDROLASE FAMILY 
SIGNATURE 


PR00413E 15.78 3.429e-
09 154-171


1037 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL- 
BINDING NU. 


PD01066 19.43 9.657e-
09 5-44 


1038 


PD01796


PROTEIN TRANSMEMBRANE
COBALT ZINC CADMIU.


PD01796 15.61 4.259e-
11 55-82


1039 


BL00299


Ubiquitin domain 
proteins . 


BL00299 28.84 9.036e- 
09 17-69 


1040 


PR00970 


ARGININE ADP-
RIBOSYLTRANSFERASE
SIGNATURE


PR00970A 17.73 6.143e-
20 56-78 PR00970D
9.36 2.154e-18 154-171
PR00970F 12.30 1.000e-
16 224-241 PR00970G
9.97 9.229e-15 242-258
PR00970B 16.37 1.290e-
13 86-105 PR00970C
11.05 1.643e-11 115-
130 PR00970E 11.23
9.820e-11 202-218


1042 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 2.200e-10 
243-254 


1043 


PR00048


C2H2-TYPE ZINC FINGER
SIGNATURE


PR00048A 10.52 6.786e-
13 114-128 PR00048A
10.52 1.000e-09 172-
186 


1045 


BL00615


C-type lectin domain 
proteins . 


BL00615A 16.68 1.720e- 
11 218-236 BL00615B 
12.25 1.857e-10 317- 
331 


1046 


BL01092 


Adenylate cyclases 
class-I proteins.


BL01092N 13.54 8.924e- 
10 3-40 


1047 


BL01216 


ATP-citrate lyase / 
succinyl-CoA ligases 
family proteins . 


BL01216D 21.75 4.316e-
28 314-344 BL01216A
13.91 1.000e-10 97-112


1049 


DM00031 


IMMUNOGLOBULIN V REGION.


DM00031B 15.41 7.616e-
12 102-136 


1050 


BL01073


Ribosomal protein L24e
proteins.


BL01073 24.30 1.000e-
40 12-62 


1054 


BL00571


Amidases proteins.


BL00571 25.69 5.875e-
31 160-212


1055


BL00030


Eukaryotic RNA-binding
region RNP-1 proteins.


BL00030A 14.39 5.235e-
11 98-117 BL00030B
7.03 4.316e-09 137-147


1058 


BL00223


Annexins repeat proteins
domain proteins.


BL00223C 24.79 8.754e-
23 262-317 BL00223A
15.59 9.478e-14 46-80
BL00223A 15.59 5.557e-
11 118-152 


1060 


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 3.455e- 
35 158-201 


1064 


BL00455


Putative AMP-binding
domain proteins.


BL00455 13.31 6.211e-
13 280-296 


1065 


PR00019 


LEUCINE-RICH REPEAT
SIGNATURE


PR00019A 11.19 2.000e-
09 115-129 PR00019B
11.36 3.880e-09 87-101


1066 


PR00326 


GTPl/OBG GTP-BINDING 
PROTEIN FAMILY SIGNATURE 


PR00326A 8.75 4.600e-
16 151-172 PR00326C
9.79 1.290e-14 200-216
PR00326B 16.74 8.548e-
14 172-191 PR00326D 
19.09 1.257e-13 217- 
236 


1071 


PD02870 


RECEPTOR INTERLEUKIN-1 
PRECURSOR. 


PD02870B 18.83 8.518e-
11 164-197 


1072 


PF00856


SET domain proteins.


PF00856A 26.14 5.976e-
09 350-387 


1075 


BL01009


Extracellular proteins
SCP/Tpx-1/Ag5/PR-1/Sc7
proteins.


BL01009D 14.19 4.300e-
20 127-148 BL01009A
13.75 6.586e-13 57-75
BL01009E 13.50 1.439e-
11 159-175


1077 


PR00724 


CARBOXYPEPTIDASE C 
SERINE PROTEASE (S10)
FAMILY SIGNATURE


PR00724A 10.91 1.000e-
08 366-379 


1078 


BL00215


Mitochondrial energy
transfer proteins.


BL00215A 15.82 1.000e-
12 170-195 BL00215A
15.82 7.529e-10 79-104


1079 


BL00678 


Trp-Asp (WD) repeat
proteins proteins.


BL00678 9.67 4.316e-09
298-309


1081 


BL00326 


Tropomyosins proteins. 


BL00326A 14.01 7.398e-
10 23-57 


1094 


BL00460


Glutathione peroxidases
selenocysteine proteins.


BL00460A 28.67 3.204e-
18 57-92 BL00460B
9.73 6.400e-13 100-118
BL00460D 16.89 9.143e-
12 162-182 BL00460C
14.35 5.500e-09 133-
156 


1095 


PD02811 


PROTEIN PEPTIDE 
REDUCTASE MG448 PILB
FIMBRIA TRAN. 


PD02811A 20.67 3.017e- 
22 67-105 PD02811B 
17.07 2.263e-21 118- 
151 PD02811C 13.25 
5.696e-13 154-167 


1096 


PD02811 


PROTEIN PEPTIDE 
REDUCTASE MG448 PILB
FIMBRIA TRAN. 


PD02811A 20.67 3.017e- 
22 60-98 PD02811B 
17.07 2.263e-21 111- 
144 PD02811C 13.25
5.696e-13 147-160


1097 


BL00479


Phorbol esters /
diacylglycerol binding
domain proteins . 


BL00479B 12.57 6.143e- 
09 200-216 


1105 


PF00881


Nitroreductase family.


PF00881A 27.15 9.229e-
13 111-147 


1109 


PR00449 


TRANSFORMING PROTEIN P21
RAS SIGNATURE 


PR00449A 13.20 3.077e- 
10 15-37 PR00449E 
13.50 1.857e-09 185- 
208 PR00449D 10.79 
8.364e-09 131-145 


1115 


PR00405


HIV REV INTERACTING 
PROTEIN SIGNATURE 


PR00405B 11.83 5.737e- 
20 42-60 PR00405A 
17.71 2.703e-17 23-43 
PR00405C 19.41 6.902e-
10 53-85 


1116 


BL00355


HMG14 and HMG17
proteins.


BL00355 5.97 2.528e-25
20-51 


1117 


BL00355 


HMG14 and HMG17 
proteins . 


BL00355 5.97 2.528e-25
20-51 


1120 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107B 13.31 4.857e-
10 290-306 


1123 


PR00412


EPOXIDE HYDROLASE
SIGNATURE 


PR00412F 18.76 9.526e- 
12 301-324 


1125 


PR00186 


HEMERYTHRIN SIGNATURE 


PR00186A 13.62 2.800e-
09 87-101 


1129 


BL00170


Cyclophilin-type
peptidyl-prolyl cis-
trans isomerase
signatur.


BL00170C 18.49 3.077e-
33 84-129 BL00170B
20.97 6.838e-25 37-77
BL00170A 17.08 3.455e-
15 10-37 


1131 


BL00636


Nt-dnaJ domain proteins.


BL00636A 8.07 5.304e-
15 29-46 BL00636B 
15.11 1.360e-14 59-80 


1132 


BL00678


Trp-Asp (WD) repeat
proteins proteins.


BL00678 9.67 6.211e-09
29-40 


1133 


BL00678


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 6.211e-09 
29-40 


1136 


BL00990 


Clathrin adaptor
complexes medium chain
proteins.


BL00990C 18.78 4.176e-
38 235-269 BL00990A
21.44 4.316e-36 94-132
BL00990B 20.15 2.125e-
27 157-187 BL00990D
16.13 5.320e-18 403-
422 


1137 


PR00314 


CLATHRIN COAT ASSEMBLY 
PROTEIN SIGNATURE 


PR00314B 15.68 8.000e-
34 100-128 PR00314D
9.66 3.531e-33 233-261
PR00314C 16.05 8.909e-
32 159-188 PR00314A
14.53 1.281e-22 13-34


1139 


BL01115


GTP-binding nuclear
protein ran proteins.


BL01115A 10.22 6.364e- 
13 13-57 


1141 


BL00107 


Protein kinases ATP- 
binding region proteins . 


BL00107A 18.39 4.006e-
19 451-482 BL00107B
13.31 3.077e-12 519- 
535 


1148


PR00685 


TRANSCRIPTION INITIATION 
FACTOR IIB SIGNATURE 


PR00685A 13.62 4.676e- 
09 21-42 


1155 


PD01652


RECEPTOR CELL NK
GLYCOPROTEIN IMMUNOGLOB.


PD01652B 8.50 9.396e- 

10 522-574 PD01652B 
8.50 9.463e-10 740-792 


1157 


PD02894 


HYDROLASE N4- PRECURSOR 
PROTEIN SIGNAL BE. 


PD02894A 21.96 7.873e-
28 81-127 PD02894B
13.93 1.188e-27 178-
211 


1159


BL00623 


GMC oxidoreductases 
proteins . 


BL00623E 15.00 3.531e- 
20 391-414 BL00623C 
10.85 4.240e-20 155- 
176 


1161 


PD01937 


DNA PROTEIN POLYMERASE 
ENDONUCLEASE DNA-.


PD01937A 6.68 3.475e-
09 330-341


1162 


PD01937 


DNA PROTEIN POLYMERASE 
ENDONUCLEASE DNA-.


PD01937A 6.68 3.475e-

09 221-232 


1163 


PR00624 


HISTONE H5 SIGNATURE


PR00624D 11.94 7.455e-
10 214-239 PR00624D 
11.94 1.961e-09 312- 
337 


1167 


BL00226 


Intermediate filaments 
proteins . 


BL00226B 23,86 7.384e- 
09 302-350 


1177 


BL01032 


Protein phosphatase 2C
proteins . 


BL01032G 8.33 1.422e- 
10 34-48 


1178 


PR00320


G-PROTEIN BETA WD-40
REPEAT SIGNATURE


PR00320A 16.74 1.794e-
10 205-220 PR00320C
13.01 7.840e-10 205-
220 PR00320B 12.19
8.457e-10 35-50
PR00320A 16.74 7.146e- 
09 35-50 PR00320B 
12.19 9.100e-09 79-94 


1180 


PR00454 


ETS DOMAIN SIGNATURE


PR00454D 10.89 4.150e- 
19 765-784 


1181 


BL00291 


Prion protein. 


BL00291A 4.49 8.962e-
11 152-167 


1184 


BL00720 


Guanine-nucleotide 
dissociation stimulators 
CDC25 family sign. 


BL00720B 16.57 4.103e-
15 1089-1113


1185 


BL00215


Mitochondrial energy
transfer proteins.


BL00215A 15.82 4.553e-
13 204-229 BL00215A
15.82 1.429e-12 11-36
BL00215A 15.82 9.809e-
11 104-129 


1187 


BL00983 


Ly-6 / u-PAR domain 
proteins , 


0xjw:7o^u J.Z . oy ^. /tile- 
10 77-93 


1188 


BL00878


Orn/DAP/Arg
decarboxylases family 2
pyridoxal-P attachment
si.


BL00878B 10.95 6.000e-
16 189-204 BL00878C
17.74 8.435e-15 225-
245 BL00878F 19.67
3.625e-13 379-402
BL00878D 16.56 1.621e-
09 270-289 


1191 


PD02939


PROTEIN GLUTATHIONE
SYNTHETASE SY.


PD02939B 10.10 2.723e-
12 203-220 PD02939C
20.01 1.000e-11 224-
252 


1193 


PR00345 


STATHMIN FAMILY 
SIGNATURE 


PR00345B 7.12 2.800e-
28 72-101 PR00345E
8.54 7.652e-28 149-174
PR00345C 4.54 9.100e-
28 101-125 PR00345D
10.97 1.964e-24 125-
149 PR00345A 13.46
5.545e-16 43-62


1194


PR00345


STATHMIN FAMILY
SIGNATURE


PR00345B 7.12 2.800e-
28 108-137 PR00345E
8.54 7.652e-28 185-210
PR00345C 4.54 9.100e-
28 137-161 PR00345D
10.97 1.964e-24 161-
185 PR00345A 13.46
5.545e-16 79-98


1195 


PF00995 


Sec1 family.


PF00995B 17.37 1.120e-
13 224-264 


1196 


BL00982


Bacterial-type phytoene
dehydrogenase proteins.


BL00982A 18.41 6.738e-
11 15-47


1197 


BL01298 


Dihydrodipicolinate
reductase proteins.


BL01298A 13.90 5.959e-
09 51-73 


1203 


BL00061 


Short-chain
dehydrogenases/reductases
family proteins.


BL00061B 25.79 1.000e-
14 152-190 


1204 


PR00118 


BETA-LACTAMASE CLASS A
SIGNATURE


PR00118F 16.42 9.386e-
09 213-229 


1206 


BL01183 


ubiE/COQ5
methyltransferase family
proteins.


BL01183B 21.31 1.429e-
37 184-229 BL01183D
27.71 8.535e-27 264-
307 BL01183A 13.25
3.250e-23 51-73
BL01183C 10.77 5.295e-
09 246-258


1208 


BL00979 


G-protein coupled
receptors family 3
proteins.


BL00979L 20.63 2.485e- 
09 105-146 


1209 


PF00023


Ank repeat proteins.


PF00023A 16.03 4.857e-
11 49-65 PF00023B
14.20 1.818e-09 45-55


1212 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 7.750e-
14 227-241 PR00048A
10.52 4.316e-11 199-
213 


1213 


PR00450


RECOVERIN FAMILY
SIGNATURE


PR00450C 12.22 1.720e-
10 20-42 PR00450C
12.22 3.506e-09 56-78
PR00450D 16.53 6.769e- 
09 44-64 


1216 


BL00412 


Neuromodulin (GAP-43)
proteins.


BL00412D 16.54 5.598e-
10 179-230 


1219 


PR00456


SIGNATURE 


exivV^^ba 3,0b S.348e- 
11 249-264 


1222 


PD00066 


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 7.231e-
15 295-308 PD00066
13.92 7.231e-15 406-
419 PD00066 13.92
2.286e-12 378-391
PD00066 13.92 7.857e-
12 434-447 PD00066
13.92 3.348e-11 350-
363 


1223 


BL50058 


G-protein gamma subunit 
profile. 


BL50058 27.23 l.OOOe- 
40 13-61 


1226 


BL00412


Neuromodulin (GAP-43)
proteins.


BL00412D 16.54 8.439e-
09 279-330 


1227 


BL00437


Catalase proximal heme-
ligand proteins.


BL00437A 18.82 1.000e-
40 49-101 BL00437B
16.28 1.000e-40 114-
168 BL00437C 21.86
1.000e-40 190-239
BL00437D 25.72 1.000e-
40 246-301 BL00437E
23.95 1.000e-40 327-
379




BL01160


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 8.297e-
10 6-60 


1231 


PR00735 


GLYCOSYL HYDROLASE 
FAMILY 8 SIGNATURE 


PR00735A 11.19 6.857e-
09 391-405 


1232 


PR00497 


NEUTROPHIL CYTOSOL 
FACTOR P40 SIGNATURE


PR00497A 6.92 5.553e-
10 158-176 


1233 


PR00497 


NEUTROPHIL CYTOSOL 
FACTOR P40 SIGNATURE 


PR00497A 6.92 5.553e-
10 158-176


1235 


BL00866 


Carbamoyl-phosphate
synthase subdomain
proteins.


BL00866B 36.29 2.776e-
09 75-121 


1237 


BL00027 


'Homeobox' domain 
proteins . 


BL00027 26.43 1.818e- 
21 36-79 


1243 


PR00403 


WW DOMAIN SIGNATURE 


PR00403B 12.19 1.104e-
11 10-25 


1246 


PD01168 


SYNTHETASE LIGASE 
PROTEIN ALANYL. 


PD01168L 9.47 2.837e-
10 31-46 PD01168L
9.47 4.490e-10 174-189
PD01168L 9.47 7.612e-
10 183-198 


1249


BL00018


EF-hand calcium-binding
domain proteins.


BL00018 7.41 2.800e-10
183-196 


1254 


BL00183


Ubiquitin-conjugating
enzymes proteins.


BL00183 28.97 2.440e- 
36 96-144 


1255 


BL01115


GTP-binding nuclear
protein ran proteins.


BL01115A 10.22 5.670e-
11 8-52


1256 


BL00373 


Phosphoribosylglycinamide
formyltransferase
proteins.


BL00373C 10.35 3.348e-
12 143-156 


1258 


PR00011


TYPE III EGF-LIKE
SIGNATURE


PR00011B 13.08 3.217e-
10 174-193 


1259 


BL00518


Zinc finger, C3HC4 type
(RING finger), proteins.


BL00518 12.23 6.266e- 
10 31-40 


1261 


PR00070 


DIHYDROFOLATE REDUCTASE
SIGNATURE


PR00070D 11.63 1.000e-
15 112-127 PR00070C
13.09 9.500e-15 51-63
PR00070A 12.92 5.500e-
12 16-27 


1262 


BL00462 


Gamma-
glutamyltranspeptidase
proteins.


BL00462A 20.89 6.438e- 
24 140-183 BL00462B 
17.88 5.500e-20 230- 
267 BL00462C 27.41 
2.023e-11 292-347


1263 


BL00038


Myc-type, 'helix-loop-
helix' dimerization
domain proteins.


BL00038B 16.97 9.455e-
11 62-83 


1264 


BL01115


GTP-binding nuclear 
protein ran proteins . 


BL01115A 10.22 5.670e- 
11 17-61 


1266 


PR00837


ALLERGEN V5/TPX-1 FAMILY 
SIGNATURE 


PR00837r 17 O n 
IT \j o ■s f ^ jt f » tid. jd./x^e 

18 165-182 PR00837A
14.77 4.512e-12 86-105
PR00837D 11.12 7.577e-
12 201-215


1269 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449C 17.27 9.308e- 
22 40-63 PR00449E 
13.50 l.OOOe-16 137- 
160 PR00449D 10.79 
3.520e-ll 102-116 


1270 


BL00276 


Channel forming colicins 
proteins. 


BL00276A 8.87 1.500e- 
09 17-29 


1275 


PD02327 


GLYCOPROTEIN ANTIGEN 
PRECURSOR IMMUNOGLO- 


PD02327C 15.47 9.769e- 
09 220-243 


1276 


PR00412


EPOXIDE HYDROLASE 


PR00412B 12.59 7.894e-
SEQ ID NO: | ACCESSION
NO.



1277 



1280 



1285



1287 



1292 



1297 



1298 



1302 



1307 



1308 



DESCRIPTION 



PF00756 



BL00134



BL01220



BL00518



PF00791



PR00802



PR00716



BL00478
BL00127



PR00637



BL00215



PR00898



PD00301



SIGNATURE 



Putative esterase.



Serine proteases, 
trypsin family, 
histidine proteins. 



Phosphatidylethanolamine
-binding protein family
proteins.



Zinc finger, C3HC4 type
(RING finger), proteins.



Domain present in ZO-1
and Unc5-like netrin
receptors. 



SERUM ALBUMIN FAMILY 
SIGNATURE 



M-PHASE INDUCER
PHOSPHATASE SIGNATURE 



RESULTS* 



12 119-135 PR00412C
11.30 1.857e-11 165-
179 PR00412A 13.23
3.400e-11 100-119



PF00756C 14.12 5.538e-
10 127-157

BL00134A 11.96 9.325e-
13 128-145 



BL01220C 14.75 9.348e- 
15 248-276 



BL00518 12.23 2.288e-
10 33-42



PF00791B 28.49 7.182e-
11 288-343



PR00802B 16.51 1.610e-
10 81-105 



LIM domain proteins. 



Pancreatic ribonuclease
family proteins.



TYPE 3 BOMBESIN RECEPTOR
SIGNATURE



Mitochondrial energy 
transfer proteins. 



PR00716C 17.65 5.696e-
09 23-44



BL00478B 14.79 6.478e-
14 268-283



BL00127C 31.49 3.571e-
28 82-126 BL00127B
26.57 8.600e-28 23-68



PR00637E 11.27 4.250e-
09 290-306 



VASOPRESSIN V2 RECEPTOR
SIGNATURE



PROTEIN REPEAT MUSCLE
CALCIUM-BI.



BL00215A 15.82 5.500e-
17 13-38 BL00215A
15.82 1.000e-16 226-
251 BL00215A 15.82
2.658e-13 107-132



PR00898H 11.34 4.662e-
09 552-572



PD00301B 5.49 2.731e-
09 390-401 



BL00983 



Ly-6 / u-PAR domain 
proteins. 



BL00983C 12.69 9.654e- 
13 73-89 BL00983B 
8.19 3.132e-09 12-22 



1314 



1316 



1327 



1329 



1331 
1332 



BL00594



BL00134



BL00783



PF00514 



BL00030 



PR00497 



Thioredoxin family 
proteins . 



Aromatic amino acids
permeases proteins.



BL00194 11.16 1.906e-
11 15-28



Serine proteases, 
trypsin family, 
histidine proteins. 



BL00594A 16.75 8.969e-
10 53-97 



BL00134A 11.96 9.325e- 
13 128-145 



Ribosomal protein L13
proteins.



Armadillo/beta-catenin-
like repeat proteins.



BL00783C 22.43 6.559e-
24 87-117 BL00783A
14.55 1.800e-19 8-33
BL00783B 12.76 3.500e-
12 74-86 



Eukaryotic RNA-binding 
region RNP-l proteins. 



i'F00514A 31,30 7.268e-'" 
11 82-120 



NEUTROPHIL CYTOSOL
FACTOR P40 SIGNATURE 



BL00030A 14.39 6.294e- 
11 129-148 BL00030B 
7.03 4.789e-09 168-178



PR00497A 6.92 7.239e-
09 25-43 



1333 



1336
1337



PD01066 



NICKEL-DEPENDENT

HYDROGENASE/B-TYPE 
CYTOCHROME SIGNATURE 
PROTEIN ZINC FINGER 
ZINC- FINGER METAL- 
BINDING NU. 



PR00161C 9.51 4.930e- 
09 317-337 



PD01066 19.43 6.769e-
33 10-49 



PR00700 
PR00700



PROTEIN TYROSINE 
PHOSPHATASE SIGNATURE 
PROTEIN TYROSINE 



PR00700D 12.47 2.200e- 
09 262-281 



PR00700D 12.47 2.200e-
PHOSPHATASE SIGNATURE


1340 


PR00860 


VERTEBRATE 

METALLOTHIONEIN 

SIGNATURE


09 211-230 

PR00860A 5.46 5.034e-
13 5-18 


1341 


BL00893


mutT domain proteins.


BL00893 18.99 6.750e-
16 46-71 


1343 


BL01282 


BIR repeat proteins.


BL01282B 30.49 5.974e-
21 383-422 


1344 


DM00099 


4 kw A55R REDUCTASE
TERMINAL 

DIHYDROPTERIDINE.


DM00099B 14.73 e.313e-"' 
09 417-427 


1345


BL00923 


Aspartate and glutamate
racemases proteins. 


BL00923B 11.41 5.935e- 
10 135-146 


1348 


PF00651


BTB (also known as BR-
C/Ttk) domain proteins. 


PF00651 15.00 7.231e-
13 44-57 


1350 


PR00193 


MYOSIN HEAVY CHAIN
SIGNATURE 


PR00193D 14.36 3.571e-
32 416-445 PR00193C
12.60 6.3iee-31 179-
^u/ PR00193B 11.69
3.571e-24 133-159
PR00193E 19.47 9.069e-
22 470-499 PR00193A
15.41 1.783e-20 77-97


1352


PR00447 


NATURAL RESISTANCE- 
ASSOCIATED MACROPHAGE 
PROTEIN SIGNATURE 


PR00447E 9.73 l.£'S4e- 
15 299-319 PR00447D 
13.54 3.408e-15 200-
224 PR00447A 12.73
6.357e-11 97-124
PR00447G 6.69 9.877e- 
10 353-373 


1353 


BL00303


S-100/ICaBP type calcium
binding protein. 


BL00303A 21.77 6.667e-
26 45-82 BL00303B 
■^o.x:^ J. • uuue — 24 93 — 130 


1355 


BL00039


DEAD-box subfamily ATP-
dependent helicases 
proteins. 


BL00039D 21.67 5.950e-
29 375-421 BL00039A 
18.44 7.136e-29 99-138 
BL00039C 15.63 4.000e- 
18 225-249 BL00039B 
19.19 3.182e-14 141- 
167 


1357 


PF00615 


Regulator of G protein 
signalling domain 
proteins. 


PF00615B 16.25 2.216e-
12 84-101 PF00615C 
10.06 8.412e-12 162- 
176 


1360 


PD01066


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 9.234e-
29 10-49 


1361 


PR00925


NONHISTONE CHROMOSOMAL 
PROTEIN HMG17 FAMILY 
SIGNATURE 


PR00925A 5.47 5.091e-
18 14-29 PR00925B
3.73 6.143e-14 29-42
PR00925C 5.57 4.789e-
12 53-64 PR00925D
6.56 l.aS7e-10 76-87 


1362 


BL01272


Glucokinase regulatory
protein family proteins.


BL01272B 19.61 6.870e-
30 136-171 BL01272C
11.68 3.314e-25 249-
274 BL01272A 6.49
1.231e-18 99-117


1363 


BL01272


Glucokinase regulatory 
protein family proteins. 


BL01272B 19.61 6.870e-
30 113-148 BL01272C
11.68 3.314e-25 226-
251 BL01272A 6.49
1.231e-18 76-94


1364 


DM00179 


W KINASE ALPHA ADHESION 
T-CELL.


DM00179 13.97 5.304e- 
09 167-177 


1368
1370 


PR00169

PR00988


POTASSIUM CHANNEL 
SIGNATURE 

URIDINE KINASE SIGNATURE 


PR00169A 16.77 1.592e-
09 76-96 

PR00988A 6.39 1.794e-
10 1-19 


1371 


BL00242 


Integrins alpha chain 
proteins . 


BL00242B 8.13 8.615e- 
09 469-479 


1372 


PR00625


DNAJ PROTEIN FAMILY 
SIGNATURE 


PR00625B 13.48 7.353e- 
19 46-67 PR00625A
12.84 1.391e-16 14-34 


1373 


BL00434


HSF-type DNA-binding
domain proteins. 


BL00434C 23.65 3.778e-
09 90-130 


1374 


PR00962 


LETHAL(2) GIANT LARVAE
PROTEIN SIGNATURE 


PR00962C 8.00 6.337e-
09 505-526 


1375


PD02475


MUCIN EPITHELIAL TUMOR- 
ASSOCIATE. 


PD0247SA 23.18 a,552e- 
10 1111-1150 


1376 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 9.571e-
32 24-63 


1380 


BL00194 


Thioredoxin family 
proteins. 


BL00194 12.16 8.333e-
12 48-61 


1381 


DM01970 


0 kw ZK632.12 YDR313C 
ENDOSOMAL III. 


DMbl9 70t5 1 At^H/> "" 
*x< 4v * ^ f o w KfV X.ftSOc — 

15 1123-1136 


1383 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins.


BL00678 9.67 7.500e-10
243-254 


1384 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins . 


BL00678 9.67 7.600e-10 
271-282 


1385 


BL00303


S-100/ICaBP type calcium
binding protein. 


BL00303B 26.15 6.203e-
10 95-132 


1386 


BL01160 


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 5.042e-
09 1574-1628 


1387 


BL00518


Zinc finger, C3HC4 type 
(RING finger), proteins.


BL00518 12.23 1.000e-
11 52-61 


1389 


PD01066 


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 3.600e-
30 10-49 


1390 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL- 
BINDING NU. 


PD01066 19.43 3.511e-
31 32-71 


1392 


PR00308 


TYPE I ANTIFREEZE 
PROTEIN SIGNATURE 


PR00308C 3.83 9.723e- 
10 127-137 


1393 


PR00380 


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380A 14.18 9.625e-
£=> ob-llO JPRO 03 80D 
9.93 2.406e-20 304-326 
PR00380B 12.64 4.414e- 

13.18 6.538e-16 243-
262 


1394 


PD00066


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 3.400e-
14 462-475 PD00066
13.92 8.800e-14 348-
361 PD00066 13.92
9.571e-12 405-418
PD00066 13.92 6.087e-
11 490-503 PD00066
13.92 8.043e-11 320-
333


1398 


PD01066


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 6.786e-
32 10-49 


1400 


DM01206 


CORONAVIRUS NUCLEOCAPSID
PROTEIN. 


DM01206B 10.69 7.038e- 
09 270-290 


1406 


PD00930


PROTEIN GTPASE DOMAIN
ACTIVATION. 


PD00930A 25.62 7.324e-
15 363-389 


1407 


BL00030


Eukaryotic RNA-binding
region RNP-1 proteins.


BL00030A 14.39 7.500e-
10 457-476 


1408 


PR00019 


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019A 11.19 9.550e- 
11 179-193 PR00019A 
11.19 8.826e-10 228- 
242 PR00019B 11.36 
1.360e-09 199-213 
PR00019B 11.36 4.960e-
09 176-190 


1409 


PR00510


NEBULIN SIGNATURE 


PR00510A 9.09 4.150e- " 
12 182-202 PR00510B 
12.96 8.767e-12 210- 
230 PR00510F 9.88 
8.172e-10 58-75 
PR00510D 9.21 2.367e-
09 251-267 


14 16 


PD00078


REPEAT PROTEIN ANK 
NUCLEAR ANKYR. 


PD00078B 13.14 5.696e- 
09 31-44 


1412


BL00358


Ribosomal protein L5 
proteins . 


BL00358B 22, 'lb l.OOCe- 
40 57-103 BL00358C
13.75 6.087e-14 122-
136 BL00358D 14.26
5.500e-13 143-158
BL00358A 13.06 1.931e- 
11 33-44 


1414 


BL00282


Kazal serine protease 
inhibitors family
proteins.


BL00282 16.88 7.338e- 
10 511-534 


1415 


BL00023 


Type II fibronectin 
collagen-binding domain 
proteins. 


BL00023 24.31 4.300e- 
29 40-77 


1417 


PR00681 


RIBOSOMAL PROTEIN S1
SIGNATURE 


PR00681G 12.54 2.149e-
09 38-60 


1418 


DM00973 


3 kw RESISTANCE BENOMYL
YLL028W CYCLOHEXIMIDE.


DM00973A 21.17 1.462e-
09 171-208 


1419 


PR00319


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE 


PR00319B 11.47 1.571e- 
09 428-443 


1420 


PD01941 


TRANSMEMBRANE
COTRANSPORTER SYMP.


PD01941A 14.81 1.000e-
40 142-196 PD01941B
15.02 7.049e-30 400-
447 PD01941E 15.92
2.475e-20 817-864
PD01941C 19.96 3.118e-
19 488-543 PD01941D
'^':.18 $.614e-18 641- 
690 PD01941F 28.52
5.352e-15 1038-1093


J-'i 


PR00205


CADHERIN SIGNATURE 


PR00205B 11.39 8.043e-
12 199-217 


1423 


PR00209


ALPHA/BETA GLIADIN 
FAMILY SIGNATURE 


PR00209B 4.88 6.318e-
11 1009-1028 


1424 




Src homology 3 (SH3)
domain proteins profile. 


BL50002A 14.19 8.200e-
14 367-386 BL50002A
14.19 9.250e-12 298-
317 BL50002A 14.19
4.462e-11 208-227

09 244-258 


1425 


PF00628 


PHD-finger.


PF00628 15.84 3.045e-
12 330-345 


1426 


PF00628


PHD-finger.


12 377-392 


1427 


PR00405


HIV REV INTERACTING 
PROTEIN SIGNATURE 


PR00405B 11.83 5.114e-
16 281-299 PR00405A 
17.71 4.306e-14 262- 
282 


1428 


BL00039


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039D 21.67 5.219e-
34 147-193 


1429 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320C 13.01 8.920e- 
10 577-592 


1430 


PR00378


INOSITOL PHOSPHATASE 
SIGNATURE 


PR00378D 16.86 7.563e- 
12 295-314 PR00378B 
13.80 8.650e-10 166-
186 


1431 


PR00928


GRAVES DISEASE CARRIER 


PR00928B 13.53 3.769e-
PROTEIN SIGNATURE


1433 


BL01113


Clq domain proteins. 


10 103-124 

BL01113B 18-26 7^b49e- 
15 14-50 BL01113C
13.18 7.000e-12 82-102 


1434 


PR00319 


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE 


PR00319B 11.47 7.983e- 
10 135-150 


1436 


BL00030 


Eukaryotic RNA-binding
region RNP-1 proteins. 


BL00030A 14.39 1.000e-
12 84-103 


1438 


BL00290 


Immunoglobulins and
major histocompatibility
complex proteins. 


BL00290B 13.17 2.500e-
09 250-268 BL00290A
20.89 4.000e-09 188-
211 


1440 


PR00806


VINCULIN SIGNATURE


PR00806B 4.28 4.960e- 
09 38-52 


1441 


PR00806


VINCULIN SIGNATURE


PR00806B 4.28 4.960e-
09 88-102 


1444 


BL00422 


Granins proteins. 


BL00422D 19.48 1.000e-
08 114-138 


1445 


PD01841 


PHOSPHORYLASE KINASE 
ALPHA MUSCL. 


PD01841A 21.71 1.000e-
40 73-123 PD01841B
14.35 1.000e-40 144-
185 PD01841D 17.87
1.000e-40 206-258
PD01841F 13.36 1.000e-
40 296-345 PD01841G
24.26 1.000e-40 349-
403 PD01841I 23.00
1.000e-40 494-536
PD01841J 14.94 1.000e-
40 895-932 PD01841L
18.42 1.000e-40 1083-
JLl^b JPU01841E 18.60 
9.7l9e-3S 2S8-256 

JrUUXOfklA X4.01 l.OOOe — 
3S 1041-1071 Pi)01841H 

aX.JU .3.AOjB**JJ, %JJ> — 

472 PD01841C 13.78
1.000e-25 185-206
PD01841M 10.82 1.250e-
20 1175-1194




1446 


PF00816 


H-NS histone family.


PF00816B 13.84 a.875e- 
09 190-220 




1447 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 2.080e-
09 402-416 




1448 


DM00315


072 RIBONUCLEASE 
INHIBITOR. 


DM00315D 18.40 7.393e- 
09 23-67 


1451 


BL00030 


Eukaryotic RNA-binding 
region RNP-1 proteins.


BL00030B 7.03 2.800e-
10 94-104 




1454 


DM01688 


2 POLY-IG RECEPTOR.


DM01688D 13.44 7.146e- 
09 382-405 




1455 


PF00777 


Sialyl transferase 
family. 


PF00777C 18.60 2.929e- " 
22 4-59 




1457 


BL00927 


Trehalase proteins. 


BL00927C 10.83 8.085e- 
09 42-53 




1460 


BL00545 


Aldose 1-epimerase
proteins.


BL00545C 11.28 7.353e-
17 169-182 BL00545A
10.20 2.071e-15 73-89
BL00545B 13.10 3.942e-
09 140-153 




1466 


PR00097 


ANTHRANILATE SYNTHASE
COMPONENT II SIGNATURE


PR00097C 9.42 9.b69e- 
09 233-245 




1472 


BL01129


Hypothetical 
yabO/yceC/sfhB family
proteins. 


BL01129E 13.25 5.250e-
22 170-195 BL01129C
25.56 9.526e-18 63-106




1473


BL00790


Receptor tyrosine kinase 
class V proteins.


BL00790I 20.01 2.821e- 
09 2114-2145 




1475


PF00686


Starch binding domain
proteins.


PF00686A 13.45 9.100e- 
09 267-277


1477 


PF00566 


Probable rabGAP domain


PF00566A 12.64 7.333e-

XV/ *t o O — fO 


1478 


BL00030 


Eukaryotic RNA-binding
region RNP-1 proteins. 


BL00030B 7.03 9.400e- 
10 43-53 


1479 


DM00406 


GLIADIN.


292-305 


1480


BL00290 


Immunoglobulins and
major histocompatibility 
complex proteins. 


15 69-87 BL00290A 
20.89 5.091e-ll 12-35 


1481 


PR00150

JtrrtU \JX D U 


CARBOXYLASE SIGNATURE 


PR00150F 10.45 9.039e-
09 21-51 


1482 


PF00780 


Domain found in NIK1-
like kinases, mouse
citron and yeast ROM,


PF00780I 14.69 4.825e-
09 107-137 


1483 


BL01160 


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 1.153e- 
09 108-162 


1485 


PD01066


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 5.909e-
25 17-56 


1486 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107B 13.31 1.529e-
09 34-50 


1488 


BL00039


DEAD-box subfamily ATP-
dependent helicases 
proteins . 


BL00039D 21.67 9.586e- 
10 116-162 


1490 


BL00166 


Enoyl-CoA 

hydratase/isomerase
proteins.


BL00166D 22.87 2.607e- 
24 190-226 BL00166C 
18.93 5.500e-14 140-
167 BL00166B 16.92 
9.357e-11 93-115


1491 


BL00452


Guanylate cyclases 
proteins . 


BL00452D 28.59 3.700e-
31 63-106 BL00452E
11.92 3.045e-13 115- 
131 


1492 


PR00019


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019A 11.19 3.667e-
09 532-546 


1497 


BL00107


Protein kinases ATP- 
binding region proteins . 


BL00107B 13.31 1.000e-
11 384-400 BL00107A
18.39 5.345e-11 322-
353 


1500 


PF00876 


Ogre family. 


PF00876E 7.99 1.947e-
10 107-117 


1502 


BL00027


'Homeobox' domain
proteins . 


BL00027 26.43 4.789e- " 
24 112-155 


1503 


BL00027 


'Homeobox' domain
proteins. 


BL00027 26.43 4.789e-
24 112-155 


1505


BL01177


Anaphylatoxin domain
proteins. 


BL01177E 20.64 5.800e- 
24 448-475 BL01177C 
17.39 5.333e-19 402-

aii01177B iJ.ol 
7.840e-16 155-171 
BL01177n 1 7 ^rt T *infj#^-^ 
15 427-445 


1506 


BL00972


Ubiquitin carboxyl- 
terminal hydrolases 
family 2 proteins. 


BL00972D 22.55 5.500e-
14 311-336 BL00972A 
11.93 7.429e-14 48-66 
BL00972E 20.72 8.759e-
10 341-363 


1512 


BL00523


Sulfatases proteins.


BL00523E 19.27 4.536e- 
22 76-106 BL00523D 
9.89 1.563e-11 40-52
BL00523F 10.85 4.162e-
09 159-170 BL00523G
9.46 5.333e-09 256-266


1516 


BL00914


Syntaxin / epimorphin 
family proteins. 


BL00914 24.91 7.045e- 
14 168-218 


1518 


BL00600 


Aminotransferases class-
III pyridoxal-phosphate
attachment si. 


BL00600A 17.98 6.143e- 
19 98-122 BL00600E 
16.43 1.771e-17 302-
331 BL00600G 12.43
9.625e-17 377-396
BL00600B 19.60 5.091e-
15 160-186 BL00600C
16.18 6.040e-12 190-
206 BL00600F 8.77
1.000e-11 343-356
BL00600D 8.71 1.000e-
10 281-295


1523 


PD00930 


PROTEIN GTPASE DOMAIN 
ACTIVATION. 


PD00930B 33.72 9.600e-
18 41-82 


1528 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320B 12.19 4.774e- 
11 192-207 PR00320B 
12.19 8.e39e-ll 272- 
287 PR00320B 12.19 
9.743e-10 106-121 
PRO0320A 16.74 l.a78e- 
09 192-207 PR00320A 
16.74 2.317e-09 106- 
121 PR00320A 16.74
8.683e-09 272-287 
PR00320C 13.01 8.800e- 
09 106-121 


1538


DM01970 


0 kw ZK632.12 YDR313C
ENDOSOMAL III.


DM01970B 8.60 4.508e-
15 171-184 


1539 


PF00781 


Diacylglycerol kinase 
catalytic domain 


PF00781D 11.11 7.593e- 
10 103-127 


1540 


PR00965 


OCULAR ALBINISM TYPE 1 


PR00965H 10.73 1.231e- 

0I2-334 FR0u96SE 
12.93 5.e46e-29 172- 
195 PR00965F 5.98
1.123e-28 209-231 
PR00965C 15.04 1.000e-

5.84 l.OOOe-27 150-170 
PR00965G 8.52 2.440e- 
27 258-279 PR00965B 
4.80 8.650e-26 88-109 
PR00965A 12.52 1.000e-
25 35-55 PR00965I
3.91 6.442e-25 385-406


1541 


BL01013


Oxysterol-binding 
protein family proteins.


BL01013D 26.81 9.719e-
17 163-207 


1543 


PD02699 


PROTEIN DNA-BINDING 
BINDING DNA. 


PD02699C 24.84 1.000e-
40 539-646 PD02699A 
8.91 2.286e-34 219-248 
PD02699B 18.28 6.143e- 
21 485-509 


1544 


PR00049


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 7.857e-
10 102-197 PR00049D 
0.00 7.102e-09 67-82 


1547 


BL00951


ER lumen protein 
retaining receptor
proteins.


BL00951C 19.35 1.000e-
40 93-142 BL00951D
13.94 8.714e-40 142-
177 BL00951A 15.10
1.000e-38 2-38
BL00951B 14.23 6.250e-
33 38-69 


1548 


BL00536


Ubiquitin-activating
enzyme proteins. 


BL00536F 13.65 8.520e-
30 279-318 BL00536D
22.91 5.737e-24 21-65
BL00536E 16.94 4.696e-
18 248-279 


1549 


PR00139


ASPARAGINASE/GLUTAMINASE 
FAMILY SIGNATURE 


PR00139C 11.72 9.679e-
09 550-569 


1553


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 5.119e-
09 58-73


1556


BL00061


short-chain 

dehydrogenases/reductases
family proteins.


BL00061B 25.79 6.276e-
13 67-105 


1557 


BL01228 


Hypothetical cof family 
proteins . 


au\JJ.^AOU JL / , O.lOSe- 

12 107-132 


1558 


BL01228 


Hypothetical cof family 
proteins. 


12 107-132 


1559 


BL01228


Hypothetical cof family 
proteins . 


iiu\j±^zou X / . 44 o.lOSe- 
12 107-132 


1562 


BL00522 


DNA polymerase family X 
proteins . 


BL00522C 11.90 6.500e-
15 412-436 BL00522B
27.30 1.738e-16 364-
410 BL00522A 25.52
6.000e-16 279-326
BL00522E 19.63 6.123e-
14 502-532 BL00522F
14.90 2.385e-13 551- 
575 


1563 


PF00651


BTB (also known as BR-
C/Ttk) domain proteins. 


PF00651 15.00 1.947e-
11 46-59 


1564 


BL00299


Ubiquitin domain 
proteins . 


BL00299 28.84 2.823e- 
10 324-376 


1566 


BL01013


Oxysterol-binding
protein family proteins. 


BL01013D 26.81 8.594e-
17 184-228 BL01013C
9.97 4.906e-12 14-24


1567 


BL00678


Trp-Asp (WD) repeat
proteins proteins . 


BL00678 9.67 3.400e-10
378-389 BL00678 9.67
5.800e-10 418-429
BL00678 9.67 a.800e-10 
295-306 


1570 


BL00479 


Phorbol esters /
diacylglycerol binding 
domain proteins. 


BL00479B 12.57 5.235e-
17 297-313 BL00479A
19.86 6.625e-15 271-
294 BL00479A 19.86 
2.667e-14 147-170 
BL00479B 12.57 6.294e- 
12 173-189 


1576 




OXYTOCIN RECEPTOR 
SIGNATURE 


PR00665G 12.36 4.673e- 
24 364-384 PR00665D
9.93 1.200e-22 138-155 
PR00665F 11.73 4.000e- 

55 'X'i'7 ^CtA o'D ft ^ ^ t?!^ ' 
/-JD^ fKUUoDl>C 

5.89 "l,000e-20 65-80 
PROD6650 5,29 4.337e- 
19 24-39 PR00665E 
5,60 2.929e-15 24fi-5fin 
PR0D66SA S.99 5.622e- 
XS 11-25 


1577 


DM00099 


4 kw A55R REDUCTASE
TERMINAL 

DIHYDROPTERIDINE.


DM00099B 14.73 9.308e- 
10 127-137 


1579 


BL00524


Somatomedin B domain 
proteins. 


BL00524A 5.65 6.776e- 
14 52-73 


1580 


PD02894 


HYDROLASE N4- PRECURSOR
k'j\Ult,XS!* 231GNAL BE . 


PD02894B 13.93 6.959e- 
16 182-215 PD02894A 
21.96 2.125e-10 57-103 


1581 


BL00411


kinesin motor domain 
proteins . 


BL00411C 15.04 5.292e-
12 32-54 BL00411H
15.66 4.441e-11 245-
276 


1582 


PR00604 


CLASS IA AND IB
CYTOCHROME C SIGNATURE


PR00604A 11.13 2.440e- 
09 79-87 


1584 


PF00651


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 1.000e-
10 225-238 


1585 


DM01551 


kw OSTEOINDUCTIVE YOPM 
MEMBRANE OUTER. 


DM01551C 14.62 9.455e-
11 125-145 


1586


DM01354


kw TRANSCRIPTASE REVERSE 
II ORF2.


nM013S4S 11.61 1,150^ 

09 474-495


1587 


PR00072


MALIC ENZYME SIGNATURE 


PR00072B 13.77 7.955e- 
33 180-210 PR00072A 
12.75 6.040e-25 120-
145 PR00072C 11.42 
4i . ^ooe-^^ ^ib — .^39 
PR00072D 10.77 3.400e-
22 276-295 PR00072E 
10.54 1.360e-19 301- 

J U VUvKJU / diij lu . 4 5 

5.304e-19 433-450 
PR00072F 8.37 5.935e-
15 332-349 


1588


BL00191 


Cytochrome b5 family,
heme-binding domain 
proteins. 


BL00191H 15.64 1.537e-
22 51-113 BL00191K
17.38 9.027e-12 398-
442 


1590 




0 kw ZK632.12 YDR313C 
ENDOSOMAL III.


DM01970B 8.60 7.716e- 
13 211-224 DM01970B 
8.60 2.157e-12 94-107


1591 


DM00517 


5 kw NUCLEAR 60.7 NUP1
CHROMOSOME.


DM00517B 10.96 6.625e-
16 1175-1193 DM00517A
8.21 1.000e-11 1015-
1026 


1592


BL00037


Myb DNA-binding domain
proteins repeat proteins 
proteins. 


BL00037B 15.92 3.250e-
27 116-142 BL00037A
16.68 2.500e-24 83-107
BL00037A 16.68 3.250e-
12 31-55 BL00037B
15.92 3.526e-11 64-90
BL00037C 16.86 9.654e-
10 146-164 


1595 


BL00028


Zinc finger, C2H2 type,

domain proteins. 


■DjLtUvUzo Xb . 0 / X .5X4e- 
09 110-127 


1598 


PF00628 


PHD-finger.


PF00628 15.84 3.250e-
11 1667-1682 


1599 


PR00014


FIBRONECTIN TYPE III
REPEAT SIGNATURE 


PR00014D 12.04 5.500e-
0^ 980-995 


1600


BL00518 


(RING finger), proteins. 


10 30-39 


1602 


BL00412 


Neuromodulin (GAP-43)
proteins. 


10 136-187 


1605 


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins.


PF00651 15.00 3.571e-
10 44-57 


1607 


BL00252 


Interferon alpha, beta
and delta family 
proteins. 


BLOO^ti^a ifl a.o iz c<*7<a-''' 

• tj^v vA^jtn. Xw.Ti^ a.Oi>/e — 

23 20-57 BL00252B 
19.79 9.125e-16 58-109 


1610 


DM00215


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 1.000e-
06 61-94 


1611


BL00904 


Protein 

prenyltransferases alpha
subunit repeat proteins
proteins.


BL00904C 8.98 7.353e- 
10 91-125 BL00904D 
1.47 6.018e-09 127-168 


1612


PF00168


C2 domain proteins. 


PF00168C 27.49 3.250e-
09 365-391 


1613 


BL00412


Neuromodulin (GAP-43)
proteins . 


BL00412D 16.54 6.051e-
09 932-983 BL00412D
16.54 7.153e-09 933-
984 


1614 


BL00559 


Eukaryotic molybdopterin
oxidoreductases
proteins.


BL00559I 13.63 3.531e-
25 54-83 BL00559K
13.17 2.957e-18 197-
224 BL00559J 19.63
6.870e-16 124-176
BL00559L 13.60 9.000e-
16 266-204 


1615 


PD01427 


TRANSFERASE 
METHYLTRANSFERASE BI. 


PD01427B 22.45 3.025e-
22 500-541 PD01427A 
19.94 8.773e-18 439-
'*■ 


1616 


BL00115


Eukaryotic RNA 
polymerase II 
heptapeptide repeat 
proteins . 


BL00115Z 3.12 7.485e-
09 152-201 BL00115Z
3.12 9.603e-09 145-194 


1617 


BL00303 


S-100/ICaBP type calcium
binding protein. 


BL00303B 26.15 7.750e-

bl-88 BLU0303A 
21.77 1.947e-31 4-41


1618 


BL01254


Fetuin family proteins . 


BL01254F 10.02 8.754e-
09 137-147 


1619 


PD01888


PEPTIDE REDUCTASE
PROTEIN METHI.


PD01888B 25.10 1.000e-
40 47-97 PD01888C
21.56 7.000e-30 125-
155 PD01888A 12.84
a.BOOe-15 7-23 


1621 


PR00239 


MOLLUSCAN RHODOPSIN C-
TERMINAL TAIL SIGNATURE 


PR00239E 1.58 3.455e-
09 692-704 PR00239E
1.58 4.580e-09 697-709
PR00239E 1.58 4.580e-
09 702-714 PR00239E
1.58 5.193e-09 703-715


1622 


PR00860 


VERTEBRATE 

METALLOTHIONEIN

SIGNATURE 


PR00860B 7.04 1.900e- 
18 27-41 PR00860C 
9.61 1.474e-14 41-51 
PR00860A 5.46 1.720e-
14 5-18


1624 


PR00784


MITOCHONDRIAL BROWN FAT
UNCOUPLING PROTEIN 
SIGNATURE 


PR00784D 15.86 8.027e- 
11 77-95 


1626 


BL00325 


Actin-depolymerizing
proteins. 


BL00325B 21.66 1.000e-
40 93-139 BL00325A
24.83 6.786e-23 61-93 


1631 


BL00064 


L-lactate dehydrogenase
proteins.


BL00064B 23.57 1.000e-
40 82-130 BL00064C
17.28 1.000e-40 137-
182 BL00064E 27.20
1.000e-40 223-275
BL00064F 25.14 7.882e-
36 286-331 BL00064A
21.16 1.000e-33 22-60
aij\j\jyj^%U . X7 b*pU0e — 
31 182-212 


1632 


PR00063 


RIBOSOMAL PROTEIN L27 
SIGNATURE 


PR00063B 15.24 9.700e-
11.71 1.614e-09 34-59


1634 


PR00239


MOLLUSCAN RHODOPSIN C- 
TERMINAL TAIL SIGNATURE 


PR00239D 0.00 1.105e-
11 36-49 PR00239C
3.51 2.538e-09 37-45 


1636 


BL01210


Caveolins proteins. 


BL01210B 13.92 9.531e-
10 133-183 


1637 


BL00982


Bacterial-type phytoene
dehydrogenase proteins.


BL00982A 18.41 5.388e- 
11 11-43 


1639 


BL01183 


ubiE/COQ5

methyltransferase family
proteins. 


BL01183B 21.31 8.144e-
12 132-177 


1640 


PR00015


GRAM-POSITIVE COCCUS 
SURFACE PROTEIN ANCHOR 
SIGNATURE 


PR00015B 9.84 8.465e-
10 128-149 


1641 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320B 12.19 5.935e- 
11 364-379 PR00320A 
16.74 7.828e-11 364-
379 PR00320C 13.01 
2.800e-10 279-294
PR00320C 13.01 2.800e- 
10 364-379 PR00320B 
12.19 5.114e-10 279-
294 PR00320A 16.74 
1.659e-09 279-294 



4
SEQ ID NO:



1642 



1643 



ACCESSION 
NO.



PF00023



PR00169 



BL00678



DESCRIPTION 



Ank repeat proteins. 



POTASSIUM CHANNEL 
SIGNATURE



Trp-Asp (WD) repeat
proteins proteins. 



RESULTS* 



PR00320A 16.74
09 229-244 



PF00023A 16.03 6.464e-
09 114-130 



PR00169A 16.77 1.806e-
11 74-94 



BL00678 9.67 2.200e-10
109-120 BL00678 9.67
5.737e-09 528-539 



1646 



1647 



1649 



1651 



1652 



1655



1656 



BL01108



PR00380 



DM01242 



PD00126 



BL01160



BL00933 



BL00795



BL00982 



BL00982



BL00741 



PRO0449 



Ribosomal protein L24

proteins . 

KINESIN HEAVY CHAIN

SIGNATURE 



BLO1108A 20.33 7.366e- 
17 56-89 



3 THREONINE- -TRNA 
LIGASE.



PROTEIN REPEAT DOMAIN 
TPR NUCLEA. 



Kinesin light chain 
repeat proteins. 



FGGY family
carbohydrate kinases 
proteins . 



Involucrin proteins.



Bacterial-type phytoene
dehydrogenase proteins.
Bacterial-type phytoene
dehydrogenase proteins.



Guanine-nucleotide 
dissociation stimulators 
CDC25 family sign.



TRANSFORMING PROTEIN P21 

RAS SIGNATURE



PR00380A 14.18 9.270e- 
21 103-125 PR00380D
9.93 6.308e-18 386-408
PR00380C 13.18 7.923e-
16 332-351 PR00380B
12.64 6.657e-15 292-
310 



DM01242C 17.15 9.791e-
37 340-381 DM01242E
23.00 5.071e-31 463-
505 DM01242D 23.29
3.925e-30 420-463
DM01242B 23.57 8.054e-
18 265-314 DM01242F
10.61 7.618e-14 526-
540 



PD00126A 22.53 5.500e-
10 13-34 



BL01160B 19.54 6.720e-
11 431-485 



BL00933A 17.50 4.673e-
12 11-35 BL00933E 
13.80 9.217e-09 456- 
472 



BL00795C 17.06 2.988e-
10 70-115 



BL00982A 18.41 7.750e-
17 302-334 



BL00982A 18.41
17 282-314 



7.750e- 



BL00741B 14.27 1.391e-
16 607-630 



PR00449A 13.20 7.938e-
11 114-136 



1659 



PR00910 



BL00972



LUTEOVIRUS ORF6 PROTEIN

SIGNATURE 

Ubiquitin carboxyl-
terminal hydrolases 
family 2 proteins. 



PR00910A 2.51 8.8e9e- 
10 442-455 



BL00972D 22.55 4.140e- 
12 376-401 BL00972E 
20,72 5.62i^e-09 446- 
468 



1660 



BL00406 



Actins proteins.



BL00406D 12.58 e.767e- 
15 188-243 



1662 



PR00105



BL00280 



1663 



PR00319 



CYTOSINE-SPECIFIC DNA

METHYLTRANSFERASE 

SIGNATURE 



Pancreatic trypsin 
inhibitor (Kunitz)
family proteins.



PR00105A 10.36 4.900e-
13 1140-1157 PR00105B
12.32 2.800e-12 1259- 
1274 PR00105C 10.86 
OOOe-lO 1305-1319 



BL00280 24.61 3.172e-
33 3119-3163 



BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE



PR00319D 11.64 6.625e- 
23 107-125 PR00319C 
13.41 5.714e-20 89-105 
PR00319A 15.27 5.286e- 
19 51-68 PR00319B 
11.47 8.200e-19 70-85


1664 


BL00018 


EF-hand calcium-binding
domain proteins.


BL00018 7.41 5.050e-10
489-502 


1667 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 8.500e- ' 
38 7-46 


1669 


BL01153 


NOL1/NOP2/sun family
proteins . 


BL01153D 19.69 1.188e- 
17 115-141 BL01153C
13.67 8.977e-15 66-80
BL01153B 20.52 1.885e- 
10 13-37 


1671 


PR00678 


PI3 KINASE P85
REGULATORY SUBUNIT
SIGNATURE 


PR00678H 9.13 3.100e- 
10 1145-1169


1672 


BL00598


Chromo domain proteins.


BL00598 14.45 5.500e-
20 27-49 


1673 


PR00326 


GTP1/OBG GTP-BINDING
PROTEIN FAMILY SIGNATURE 


PR00326A 8.75 8.329e-
09 686-707 


1674 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 7.580e-
11 343-358 PR00049D
0 00 1 286e-l0 t4:>-ic;7 


1676 


PR00747 


GLYCOSYL HYDROLASE 
FAMILY 47 SIGNATURE 


PR00747H 12.76 8.636e-
19 427-448 PR00747G
14.50 2.286e-18 368-
393 PR00747C 12.06
7.500e-18 112-131
PR00747A 14.05 4.600e-
17 42-63 PR00747D
15.23 8.759e-17 163-
183 PR00747E 15.13

PR00747B 7.65 5.355e-
13 75-90 PR00747F 
13.56 8.714e-10 311- 
328 


1677 


PR00747


GLYCOSYL HYDROLASE 
FAMILY 47 SIGNATURE 


PR00747H 12.76 8.636e- 
19 309-330 PR00747G 
14.50 2.286e-18 250- 
275 PR00747C 12.06 
7.500e-18 112-131 
PR00747A 14.05 4.600e-
17 42-63 PR00747B
7.65 5.355e-13 75-90
PR00747F 13.56 8.714e-
10 193-210 


1680 


BL00678


Trp-Asp (WD) repeat 
proteins proteins.


BL00678 9.67 4.600e-10
406-417 BL00678 9.67
6.684e-09 320-331 


1681 




Trp-Asp (WD) repeat
proteins proteins. 


BL00678 9.67 4.600e-10
329-340 BL00678 9.67
6.684e-09 243-254 


1683 


PR00326 


GTP1/OBG GTP-BINDING
PROTEIN FAMILY SIGNATURE 


PR00326A 8.75 1.346e- 
13 389-410 


1685 




RDCl ORPHAN RECEPTOR 
SIGNATURE 


PR00^4^H ^.32 4.188e- 
09 7SS-771 


1690 


BL01160


Kinesin light chain
repeat proteins. 


BL01160B 19.54 6.644e- 
09 75-129 


1691 


PR00456


RIBOSOMAL PROTEIN P2
SIGNATURE 


PR00456E 3.06 7.281e-
10 418-433 PR00456E
3.06 7.281e-10 419-434
PR00456E 3.06 8.125e-
10 420-435 


1692 


PR00456


RIBOSOMAL PROTEIN P2
SIGNATURE 


PR00456E 3.06 7.281e-
10 487-502 PR00456E
3.06 7.281e-10 488-503
PR00456E 3.06 8.125e- 
10 489-504 


1693 


BL00674 


AAA-protein family 
proteins . 


BL00674C 22-60 a.043e- 
24 274-317 BL00674B
4.46 4.000e-23 241-263
BL00674D 23.41 8.560e-
18 338-385 BL00674E 
15.24 1.720e-15 414- 
434 


1697 


PR00409 


PHTHALATE DIOXYGENASE 
REDUCTASE FAMILY 
SIGNATURE 


PR00409F 12.70 4.388e- 
10 427-447 


1698 


PR00466 


CYTOCHROME B-245 HEAVY
CHAIN SIGNATURE 


PR00466C 10.17 3.443e- 
13 187-208 PR00466B 
5.03 5.500e-11 162-186
PR00466P 9.16 6.l59e- 
09 498-517 


1699 


BL00028


Zinc finger, C2H2 type,
domain proteins. 


BL00028 16.07 9.217e- 
12 283-300 BL00028 
16.07 3.769e-11 255-
272 BL00028 16.07
5.154e-11 171-188
BL00028 16.07 5.500e-
11 227-244 BL00028
16.07 1.600e-10 199- 
215 


1700 


BL01019 


ADP-ribosylation factors 
family proteins. 


BL01019A 13.20 3.348e- 
15 62-102 BL01019B 
19.49 4.000e-15 107-
162 


1703 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 2.484e- 
12 200-239 


1707 


PR00109


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109B 12.27 4.SSee- 
14 134-153 


1710 


PR00019


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019A 11.19 2.565e- 
10 116-130 PR00019B 
11.36 4.600e-09 113- 
127 PR00019B 11.36 
7.120e-09 204-218 


1711 




WW/rsp5/WWP domain
proteins. 


BL01159 13.85 6.523e-
11 232-247 BL01159
13.85 5.408e-10 613-
628 


1712 


PF00023 


Ank repeat proteins. 


PF00023A 16.03 7.000e-


1713


PF00642 


Zinc finger C-x8-C-x5-C-
x3-H type (and similar).


PF00642 11.59 9.550e-


1714 


PF00642 


Zinc finger C-x8-C-x5-C- 
x3-H type (and similar).


PF00642 11.59 9.550e- 
11 230-241 


1715 


BL01115 


GTP-binding nuclear 
protein, ran proteins. 


BL01115A 10.22 7.129e- " 
09 7-51


1718 


BL00353


HMG1/2 proteins.


BL00353C 14.83 6.018e-
10 136-183 BL00353B 

11.47 8 R(\(it>'-n<i Ri?..1 


1719 


BL00412 


Neuromodulin (GAP-43)
proteins. 


BL00412D 16.54 5.408e-
09 432-483 


1721 


BL00038


Myc-type, 'helix-loop-
helix' dimerization 
domain proteins. 


BL00038B 16,97 a.448e- 
12 79-100 BL00038A
13.61 4.000e-11 52-68


1723 


PD00567


PROTEIN RNA-BINDING RNA
REPEAT HYD. 


PD00567C 9.17 5.500e-
09 418-428 


1724 


BL01279


Protein-L-
isoaspartate(D-
aspartate) O-
methyltransferase signa.


BL01279A 24.27 5.663e-
12 233-281 


1728 


BL00018 


EF-hand calcium-binding
domain proteins. 


BL00018 7.41 2.059e-11
73-86 BL00018 7.41
4.176e-11 157-170


1730 


BL00594


Aromatic amino acids 
permeases proteins. 


BL00594A 16.75 1.089e-
09 17-61


1731


BL01160


Kinesin light chain
repeat proteins.


Bt,01160B 19.54 9.67^e- 
10 296-350 


1732 


BL01160


Kinesin light chain 
repeat proteins . 


BL01160B 19.54 9.676e- 
10 316-370 


1733 


PF00850


Histone deacetylase
family.


PF00850F 15.70 4.349e-
22 246-279 PF00850D
14.76 6,aS0e-20 177-
201 PF00850E 8.88
8.69le-ia 209-235
PF00850G 22.75 4.098e-
14 281-323


1734 


BL00354 


HMG-I and HMG-Y DNA- 
binding domain proteins 
(A+T-hook).


BL00354C 6.61 5.932e-
09 292-307 


1735 


DM00179 


w KINASE ALPHA ADHESION 
T-CELL. 


DM00179 13.97 5.263e-
10 492-502 


1743 


PR00449 


TRANSFORMING PROTEIN P21
RAS SIGNATURE


PR00449A 13.20 1.188e-
11 5-27 PR00449D
10.79 2.241e-10 109-
123 PR00449E 13.50
9.289e-10 144-167


1744 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 1.188e-
11 5-27 PR00449D
10.79 2.241e-10 109-
123 PR00449E 13.50
9.289e-10 144-167


1745 


BL00720


Guanine-nucleotide 
dissociation stimulators 
CDC25 family sign. 


BL00720B 16.57 8.297e-
15 136-160 


1746 


PR00081 


GLUCOSE/RIBITOL
DEHYDROGENASE FAMILY 
SIGNATURE 


PR00081B 10.38 6.727e-
11 45-57 PR00081F
17.54 3.935e-10 150-
168


1747 


BL00439 


Acyltransferases
ChoActase / COT / CPT
family proteins.


BL00439H 18.24 8.435e-
14 65-91 BL00439G
13.40 2.895e-12 3-14


1749 


PR00819 


CBXX/CFQX SUPERFAMILY 
SIGNATURE 


PR00819B 10.83 7.158e-
11 4-20


1751 


PD00066


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 3.400e-
14 33-46 PD00066
13.92 1.000e-13 89-102
PD00066 13.92 7.000e-
13 61-74 PD00066
13.92 6.571e-12 117-
130


1753 


BL01013


Oxysterol-binding
protein family proteins.


BL01013D 26.81 6.516e-
18 33-77 


1754 


BL00790 


Receptor tyrosine kinase 
class V proteins. 


BL00790I 20.01 2.393e-
09 490-521 BL00790I
20.01 2.821e-09 60-91
BL00790I 20.01 6.357e-
09 287-318


X / DO 


PD01066 


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 9.750e-
35 10-49 


1750 


DM00406 


GLIADIN.


DM00406 7.73 7.600e-09 
653-666 


1762 


PD02929 


ADHESION GLYCOPROTEIN 
PRECURSOR I. 


PD02929A 28.27 4.529e-
09 224-278 


1765 


PR00326 


GTP1/OBG GTP-BINDING
PROTEIN FAMILY SIGNATURE 


PR00326A 8.75 5.950e- 
11 146-167 


1775 


PF00023 


Ank repeat proteins.


PF00023A 16.03 3.077e- 
14 523-539 


1776 


BL00942


glpT family of
transporters proteins.


BL00942F 15.07 4.343e-
10 371-389 BL00942B
20.36 8.040e-09 94-137


1777 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 2.373e-
09 279-312 


1778 


BL00084 


Copper type II,
ascorbate-dependent
monooxygenases proteins.


BL00084D 25.11 3.700e-
20 169-224 BL00084B
24.26 8.134e-16 10-58
BL00084C 27.71 8.412e-
11 107-158


1779 


BL01013 


Oxysterol-binding
protein family proteins.


BL01013D 25.81 3.758e-
18 611-655 BL01013A
25.14 2.831e-15 344-
380 BL01013C 9.97
6.308e-13 435-445
BL01013B 11.33 3.717e-
12 409-420


1783 


BL00741 


Guanine-nucleotide 
dissociation stimulators 
CDC24 family sign. 


BL00741B 14.27 8.138e-
13 492-515 


1784 


BL00741 


Guanine-nucleotide
dissociation stimulators 
CDC24 family sign. 


BL00741B 14.27 8.138e- 
13 492-515 



* Results include, in order: accession number subtype; raw score; p-value; position of
signature in amino acid sequence.


TABLE 4



SEQ ID
NO:


PFAM NAME


DESCRIPTION


p-value


PFAM
SCORE


2 




Immunoglobulin domain 


2.1e-32 


109.5 


3 


pkinase 


Eukaryotic protein kinase 
domain


1.3e-29 


110.7 


4 


zf-C2H2


Zinc finger, C2H2 type


1.6e-21


84.9




fn3


Fibronectin type III domain


0 


1097.1 


6 


fn3 


Fibronectin type III domain


0


1035.0


7 


fn3


Fibronectin type III domain 


0 


1090.4 


8 


fn3 


Fibronectin type III domain


0 


1097.1 


9 


TBC 


TBC domain 


4e-40 


146.7


10


p450


Cytochrome P450


9.5e-17


62.0 


12 


ank


Ank repeat


6e-20


79.7


14 


ig 


Immunoglobulin domain 


1.7e-05 


22.7


15


zf-MYND


MYND finger


1.3e-06


35.4


16 


zf-MYND 


MYND finger 


1.3e-06 


35.4


17


zf-C2H2


Zinc finger, C2H2 type 


1.7e-39 


343.9 


18 


CAP_GLY 


CAP-Gly domain 


1.2e-25 


98.7


20 


IMPDH_C 


IMP dehydrogenase / GMP 
reductase C terminus 


1.6e-119


410.5 


21 


IMPDH_C 


IMP dehydrogenase / GMP 
reductase C terminus 


4.3e-102


352.6


22 


pkinase 


Eukaryotic protein kinase 
domain 


2.4e-79


277.0 


23 


pkinase


Eukaryotic protein kinase
domain


8.4e-74


258.6


25 




RNA polymerase alpha subunit 


0 


1077.7 


26 


C1q


C1q domain


l,9e-ic 


44.4 


27 


Ribosomal_L23


Ribosomal protein L23 


7.8e-32 


111.2 


28 


Ribosomal_L23


Ribosomal protein L23


1e-29


104.2 


30 


zf-A20


A20-like zinc finger


1.5e-10


48.5 


31 


zf-A20


A20-like zinc finger


1.5e-10


48.5


32 


FMN_dh 


FMN-dependent dehydrogenase


5.4e-179


608.1 


34 


PID 


Phosphotyrosine interaction 
domain (PTB/PID)


3.8e-59


209.9


35 




Immunoglobulin domain 


1.4e-13 


48.8 


36 


ig 


Immunoglobulin domain 


1.4e-13


48.8 


40


kinesin 


Kinesin motor domain 


6.7e-76


265.6




Ets 


Ets-domain


1.4e-56


182.1 






Ets-domain


1.4e-56


182.1 




LRR


Leucine Rich Repeat


1.7e-13


58.3 


48


zf-C2H2


Zinc finger, C2H2 type


2.3e-162


552.8 


49 


ITAM 


Immunoreceptor tyrosine-based
activation mot


1.4e-05


31.9 


50 


UCH-2


Ubiquitin carboxyl-terminal
hydrolase family


1.1e-26


102.0 


51


UCH-2


Ubiquitin carboxyl-terminal


1.1e-26


102.0 


52


ras


Ras family


6.5e-45


162.3


53 


PRK


Phosphoribulokinase


2.1e-65


230.7 


54 


myb_DNA-binding


Myb-like DNA-binding domain


0.096


15.2


55


voltage_CLC 


Voltage gated chloride channels 


3.3e-186 


631.9 


56 


sugar_tr 


Sugar (and other) transporter 


0.00015


-64.3 


57 


TBC 


TBC domain 


2 .2e-37 


137.6


58


ank 


Ank repeat 


5.9e-25 


96.3 


59 


ank 


Ank repeat 


5.9e-25 


96.3


67


PMP22_Claudin


PMP-22/EMP/MP20/Claudin family 


7.9e-49 


175.6 


68 


C2 


C2 domain


7.9e-54


192.2 


69 


C2 


C2 domain 


2.3e-54 


194.0


7^ 


Kelch 


Kelch motif 


9.4e-99


341.5 




ig


Immunoglobulin domain


8.2e-28


94.7 


73 


pkinase 


Eukaryotic protein kinase 


ae-69 


242.1 
SEQ ID
NO: 



76 



83



88
89
92
93



96 



98 



102 



PFAM NAME



pkinase



^ 

C4 Topoisom 



Peptidase_S9



DESCRIPTION 



Eukaryotic protein kinase 
domain 



Topoisomerase DNA binding C4
zinc fing



fn3 



SH2 



WD40 



laminin_G



AMP-binding
pkinase



pkinase



adh_short



kinesin



IRS 



AAA 



Prolyl oligopeptidase family



Fibronectin type III domain
Src homology domain 2



Immunoglobulin domain



WD domain, G-beta repeat



Laminin G domain
AMP-binding enzyme



Eukaryotic protein kinase
domain

Eukaryotic protein kinase
domain



short chain dehydrogenase
Kinesin motor domain
PTB domain (IRS-1 type)



p -value 



2.8e-38 



4.3e-10



3.1e-22



0.0091 



2.1e-21 



6.1e-27 



2.4e-13



1.4e-59 



2,6e-£l 



2e-61 



2.2e-86



ATPases associated with various 
cellular act 



574e-36 
6.8e-0S 



PFAM 
SCORE 



36.8



183.2
67.7



14.0



84.6



98.5



-37.2 



211.4 



183.9



217.5 



300.4 
131.0



-5.2 



106 



109 
113 



120 



Eukaryotic protein kinase
domain 



2.7e-73 



FYVE



Cyt_reductas 



Ras family 
FYVE zinc finger



8.3e-24 



zf-C2H2



FAD/NAD-binding Cytochrome
reductase 



5.4e-27 



7.7e-61



pkinase 



Zinc finger, C2H2 type 



PH 



Eukaryotic protein kinase
domain



2.3e-122



4e-88 



lipocalin



PH domain 



pkinase 



Lipocalin / cytosolic fatty-
acid binding pr



3.1e-11



2.4e-14 



WD40



Eukaryotic protein kinase 
domain 



4.5e-20



WD domain, G-beta repeat



IF5_eIF4_eIF



WD domain, G-beta repeat



2.4e-14



eIF4-gamma/eIF5/eIF2-epsilon



2.4e-14



1e-32



256.5



100.7



215.5



420.0 



306.2 



45.2 



?3Ts" 



76.3 



errr" 



61.1



122.2



129 



mito_carr

PP2C



ATP1G1_PLM_MAT8



pfkB 



Immunoglobulin domain



Mitochondrial carrier proteins



6.5e-08



Protein phosphatase 2C



3e-16



ATP1G1/PLM/MAT8 family



pfkB family carbohydrate kinase 



2.2e-71 



3.1e-20 



4.5e-42



30.6



58.6



250.6 



80.6 



134



135
136 



Acyl CoA binding protein 



IQ 

AtPlGl_ 

AT8 

WH2 



RNA recognition motif. 



4.6e-22 



IQ calmodulin-binding motif



ATP1G1/PLM/MAT8 family



1.2e-31
2.6e-08



86.7 



118.5 



41.0 



9.3e-22



85.7



141 



143 



146 



148 
149 



zf-C2H2 



Wiskott Aldrich syndrome
homology region 2 



0.0067 



23.1



Peptidase_S26



Zinc finger, C2H2 type
Signal peptidase I 



1.7e-82 



5.7e-10



35.7 



arf 



KRAB 



ADP-ribosylation factor family 



DUF6
PDEase 



KRAB box 

Integral membrane protein DUF6



1.2e-39
7.3e-30



0.096



145.2



112.6



8.0 



3'5'-cyclic nucleotide
phosphodiesterase 



3.ee-80 



231.1 



151 
153 



S4
tRNA-synt_1d



S4 domain



1.1e-08



42.3 



154 



155 



157 



Cyt_reductase



tRNA synthetases class I (R) 



FAD/NAD-binding Cytochrome

reductase

Ras family



3.8e-103



356.1



7.8e-60



212.2



Actin



3.6e-28 



3.8e-26



107.0 
8771 


158


Jacalin 


Jacalin-like lectin domain 


0.09 


-24.9 


160 


Zn_carbOpept


Zinc carboxypeptidase


5e-138


471.9 


165 


pkinase 


Eukaryotic protein kinase
domain


5.1e-67


236.1


167


zf-C3HC4


Zinc finger, C3HC4 type (RING
finger)


5.3e-07


27.0


168 


Ribosomal_S15


Ribosomal protein S15


1.1e-06


29.0 


169 


DEAD 


DEAD/DEAH box helicase


1e-48


157.0 


171 


DUF59 


Domain of unknown function 
DUF59


0.07 


-17.4 


172 


pkinase 


Eukaryotic protein kinase 
domain 


3.7e-15


58.6 


173 


globin 


Globin 


4.6e-18


67.4


174 


WW 


WW domain 


7.3e-06 


32.9


175 


ras 


Ras family 


1e-31


118.8


178


ATP1G1_PLM_MAT8


ATP1G1/PLM/MAT8 family


2.5e-17


71.0 


179 


zf-C2H2


Zinc finger, C2H2 type


1.5e-99


344.2 


180 


C1q


C1q domain


8.ae-72


251.9


190 


Y_phosphatase


Protein-tyrosine phosphatase


4.9e-287


967.0


191


efhand


EF hand


7.5e-16


66.1 


193 


pkinase


Eukaryotic protein kinase 
domain 


6.5e-82 


285.6 


194 


bromodomain 


Bromodomain 


5.ae-31 


111.4


195 


PALP 


Pyridoxal-phosphate dependent
enzyme


2.5e-64


227.1


1^7 


DnaJ 


DnaJ domain


1.6e-38 


141.4 


199 


RrnaAD


Ribosomal RNA adenine
dimethylases


0.00018


16.9


200


acid_phosphat


Histidine acid phosphatase


2.5e-10


37V2 


201 


WH2 


Wiskott Aldrich syndrome
homology region 2 


0.00048 


26.9 


204 


vATP-synt_AC39


ATP synthase (C/AC39) subunit


1.3e-159


543.7


205


vATP-synt_AC39


ATP synthase (C/AC39) subunit


1.6e-139


476.9 


206 


ldl_recept_a 


Low-density lipoprotein 
receptor domain 


2.4e-25 


97.6


209 


ank 


Ank repeat 


1.4e-19


78.4 


210 


Rhomboid 


Rhomboid family 


0.0035


1.2 


211 


C1q


C1q domain


1.6e-70


247.7


212


UQ_con


Ubiquitin-conjugating enzyme


7.4e-74


258. e 


213 


UQ_con 


Ubiquitin-conjugating enzyme


1e-53


191.9 


215 


DEAD 


DEAD/DEAH box helicase


1.8e-43 


140.4 


216 


PMP22_Claudin


PMP-22/EMP/MP20/Claudin family


4.5e-21


83.4


218


Glycos trans 

•p o 


Glycosyl transferases 


4e-21 


83.6




ig


Immunoglobulin domain


0.092


10.7


222


WD40


WD domain, G-beta repeat


7.4e-23


89.4 


224 


TPR 


TPR Domain


1.2e-08


42.x 


225 


DnaJ_CXXCXGXG


DnaJ central domain (4 repeats)


1.5e-38


141.5


226


DnaJ_CXXCXGXG


DnaJ central domain (4 repeats) 


1.5e-38 


141.5 


229 


HSP70 


Hsp70 protein 


2.4e-54


194.0


230 


GSHPx 


Glutathione peroxidases 


3.4e-47


170.2 


231 


tsp_1


Thrombospondin type 1 domain


0.0075


17.1 


233 


cyclin 


Cyclin


4.6e-144


492.0 


234 


ras 


Ras family 


4.8e-50


179.7


235 


LRR 


Leucine Rich Repeat 


1.2e-30


115.3


236 


LRR 


Leucine Rich Repeat 


6.7e-29


109.4 


237 


PDZ


PDZ domain (Also known as DHR
or GLGF).


1.7e-09 


45.0 


244 
245 


dCMP_cyt_deam


Cytidine and deoxycytidylate
deaminase

Immunoglobulin domain


2.5e-05


31.1 


248 
250 


wnt 


wnt family of developmental 
signaling protei 
Mitochondrial carrier proteins 


6 ,7e-oe 
9.1e-270


30.5 
742.5 


254


adenylatekinase


Adenylate kinase


1.3e-55
1.8e-14


193.6 
55.7 


255
256


Cation_efflux

SH3 


Cation efflux family 
SH3 domain 


2.8e-33 


124.0 


257 


Aa_trans


Transmembrane amino acid
transporter protein


3.9e-14
2.6e-52 


60.4 
187.2 


258 
259 


adenylatekinase

HIT 


Adenylate kinase 
HIT family 


2.1e-110 


380.2 


260 
262


Bacterial_PQQ

proteasome


PQQ enzyme repeat
Proteasome A-type and B-type


8.2e-07
1.6e-15


25.3
65.0 


267 
270 


pkinase 
filament


Eukaryotic protein kinase
domain

Intermediate filament proteins


6.5e-64
6.3e-27


225.7 
101.0 


271
277


Choline_kinase
Ribosomal_S7


Choline/ethanolamine kinase
Ribosomal protein S7p/S5e


3.2e-150
2e-67 


512.5 
237.4 


279 

280 
281 


pkinase 

WD40 
WD40 


Eukaryotic protein kinase 
domain 

WD domain, G-beta repeat


3.3e-20
3.3e-77

7.8e-73
7.8e-73


80.6
269.9

255.4
255.4 


284 
287 
291 


zf-DHHC 
Exonuclease
SAM


DHHC zinc finger domain

Exonuclease
SAM domain (Sterile alpha
motif)


4.6e-24
1.4e-67
0.034


93.4
238.0
11.2 


292 

294 
295 


SAM 

zf-C2H2
zf-C2H2


SAM domain (Sterile alpha
motif)

Zinc finger, C2H2 type
Zinc finger, C2H2 type


0.034

1.4e-29
2.2e-125


11.2

III. :? 

430.0 


296 
297 
302 


mito_carr
HMG_box
Glycos_transf_4


Mitochondrial carrier proteins
HMG (high mobility group) box
Glycosyl transferase


4.1e-59
5e-87


205.5
109.4
302.5


304 
305 


tRNA-synt_2
KRAB


tRNA synthetases class II (D, K
and N)
KRAB box


1.1e-84
2e-44 


294.8 
161.0 


306 
308 


rrm
7tm_1


RNA recognition motif.
7 transmembrane receptor
(rhodopsin family)


2.7e-44
7.2e-39


160.6 
126.1 


309 

311 
312
313


DNA_polymeraseX

ig

Ets


DNA polymerase X family

F-box domain.
Immunoglobulin domain
Ets-domain


2.4e-64

9.5e-08
6.8e-19
8.ae-60


227.2

39.2
65.9
192.3


315
317 
318 
320 


Kelch 

arf

sugar_tr
pkinase


Kelch motif

ADP-ribosylation factor family
Sugar (and other) transporter
Eukaryotic protein kinase
domain


1.3e-106
3.2e-35
0.0003
8.1e-83


367.6
130.4
-73.1
288.6


322 

324 
326 


pkinase 
Xlink

ARID


Eukaryotic protein kinase
domain

Extracellular link domain
ARID DNA binding domain


4.9e-81

4.Se-X43
5.1e-37


282.6

331.5
136.4 


327 
328 
331 


HMG_box

cadherin

chromo


HMG (high mobility group) box
Cadherin domain
'chromo' (CHRromatin
Organization Modifier)


6.7e-29
9.1e-81
4e-18


109.4
281.9 
66.7 


333


Peptidase_M22


Glycoprotease family


1.2e-136


467.4


335 


vwa 


von Willebrand factor type A 
domain 


2.3e-07 


37.9 


339 


ras 


Ras family 


7.8e-07


-59.1 


340 


zf-C2H2


Zinc finger, C2H2 type


8.2e-64


225.4 


342 


zf-C2H2


Zinc finger, C2H2 type


2.4e-85


297.0 


343 


ig


Immunoglobulin domain 


0.0005 


18.0 


346


pkinase 


Eukaryotic protein kinase 
domain 


6.5e-65


229.1


347 


pkinase 


Eukaryotic protein kinase 
domain 


6.5e-65


229.1 


351 


EGF 


EGF-like domain 


8.5e-20


79.2 


352 


ank 


Ank repeat 


2.5e-101


350.0


354 


TBC 


TBC domain 


S.ie-lS 


63.3


355


PHD


PHD-finger


3.2e-07


37.4 


358 


DUF6 


Integral membrane protein DUF6


0.033


15.8


359


zf-C2H2


Zinc finger, C2H2 type


7.4e-20 


79.4 


361 


ank 


Ank repeat 


6.6e-34


126.1 


362 


ArfGap 


Putative GTP-ase activating 
protein for Arf 


4.7e-53


189.7


363 


efhand 


EF hand 


5.4e-10


46.6


367


LRR


Leucine Rich Repeat


5.5e-44


158.9


368


laminin_G


Laminin G domain


1.5e-33


121.7


369 




Protein phosphatase 2C 


5.3e-20


73.9


372


LIM


LIM domain containing proteins




57 , X 


373 


KRAB


KRAB box


4.8e-23


90.0


376


ion_trans


Ion transport protein 


2.9e-09


-4.2 


377 


Beach 


Beige/BEACH domain 


4.9e-208


7^4 , 5 


380 


pkinase 


Eukaryotic protein kinase 
domain 


1.6e-94 


327.5


381


AMP-binding


AMP-binding enzyme


1.4e-07


-140.3 


382 


HECT 


HECT-domain (ubiquitin-
transferase).


1.3e-07


-13.5


384 


ank 


Ank repeat 


2.5e-101


350.0


386 




Immunoglobulin domain 


9.5e-0S 


23.6


368 




Zinc finger, C2H2 type 


1.7e-42 


154.6 


389 


ig 


Immunoglobulin domain 


2.8e-15 


54.3


390


mito_carr


Mitochondrial carrier proteins


3.5e-67


233.2


3^2 


TPR 


TPR Domain 


6.1e-17 


69.7 


393 


SH3 


SH3 domain 


3.5e-09


43.9


394 


AAA 


ATPases associated with various
cellular act 


4.1e-21 


83.6


396 


spectrin 


Spectrin repeat 


2.1e-fi7 


237.3


397


zf-C2H2


Zinc finger, C2H2 type


0.0066 


23.1 


399 


fn3


Fibronectin type III domain


4.1e-102


352.6 


400 


WD40 


WD domain, G-beta repeat 


0.00049 


26.8 


401 


E1_dehydrog


Dehydrogenase E1 component


3e-119 


409.6 


402 


fn3


Fibronectin type III domain


0 


1719.6 


404 


LRR 


Leucine Rich Repeat 


2.1e-10 


48.0


405


cadherin


Cadherin domain


9.1e-81


281.9 


406 


zf-CXXC


CXXC zinc finger 


5e-is 


63.4 


410 


RhoGEF


RhoGEF domain


1.1e-23


92.1 


411 


F-box 


F-box domain.


4.2e-06


33.7 


412 




SNF2 and others N- terminal 
domain 


5.8e-16 


61.6


415


CPSase_L_chain


Carbamoyl-phosphate synthase
(CPSase) 


1.5e-172 


586.6 


418 


LRR 


Leucine Rich Repeat 


3.8e-24


93.6 


419 


DENN


DENN (AEX-3) domain


2e-58


207.5


420 


RasGEF 


RasGEF domain 


8.1e-43 


1^5.7 


421 


ank 


Ank repeat 


1.4e-153


523.7


424


G-patch


G-patch domain


1e-19


78.9 


425 


pkinase 


Eukaryotic protein kinase 
domain 


2.2e-31 


117.1


426


Plexin_repeat


Plexin repeat


0.0023 


24.6 


427 


Plexin_repeat


Plexin repeat 


0.0023 


24.6 


429 


zf-C3HC4


Zinc finger, C3HC4 type (RING 
finger) 


8.6e-li 


39.2


431


DEAD


DEAD/DEAH box helicase


1e-66


214.0 


432 


SH3 


SH3 domain 


3.4e-16 


S7.2 


433 


GTP_CDC


Cell division protein


2.1e-114


393.5


436


Collagen


Collagen triple helix repeat
(20 copies)


4.6e-194


658.1 


438 


Ricin_B_lectin


Similarity to lectin domain at 
ricin b 


0.0085 


"^10 .5 


441 


Alpha_adaptin_C


Alpha adaptin carboxyl-terminal
domai


1.2e-256


866.0 


442 


Alpha_adaptin_C


Alpha adaptin carboxyl-terminal
domai


1.8e-235


795.7


443


PDZ


PDZ domain (Also known as DHR
or GLGF).


1.9e-65


230.9


445 


LON 


ATP-dependent protease La (LON)
domain 


0.00012 


-17.1 


446 




Immunoglobulin domain 


0.00011


20.1


451


sushi


Sushi domain (SCR repeat)


1.4e-18


75.2


452


fn3


Fibronectin type III domain


1.5e-06


35.2


454


pyridoxal_deC


Pyridoxal-dependent
decarboxylase conse


6.3e-14


50.3


456


kinesin 


Kinesin motor domain 


4.9e-217


734.4 


457 


neur_chan


Neurotransmitter-gated ion-
channel


1e-175


597.1 


458 


Josephin 


Josephin 


0.0002


16.7


468




bZIP transcription factor


1.7e-07


31.8


470 


NTP_transferase


Nucleotidyl transferase


6.3e-06


-26.3 


471 


WD40 


WD domain, G-beta repeat 


2e-28 


107.9 


473 


LIM


LIM domain containing proteins


0.00021


20.7


477


zf-RanBP


Zn-finger in Ran binding
protein and others.


0.028


21.0


479 


WD40 


WD domain, G-beta repeat


6.5e-18


73.0


480


KRAB


KRAB box


1e-31


118.8


481


ArfGap


Putative GTP-ase activating
protein for Arf


8.4e-66


232.0


485 


SH2 


Src homology domain 2 


0.011 


11.4 


486 


C1q


C1q domain


4.3e-74


259.6


487


dsrm


Double-stranded RNA binding
motif


1.1e-47


171.9 


489 


zf-C2H2


Zinc finger, C2H2 type


4.8e-153


521.9


490


Alpha_adaptin_C


Alpha adaptin carboxyl-terminal
domai 


3.4e-222 


751.6 


492 


SKI 


Shikimate kinase


1.2e-10 


48.8 


497 


ENV_polyprotein


ENV polyprotein (coat
polyprotein)


2.6e-22


77.6


498


abhydrolase_2


Phospholipase/Carboxylesterase


0.041 


-48.1 


500 




RNA recognition motif.


5.4e-34


126.4


501


WW


WW domain


4.6e-18


73.4 


502 


ig 


Immunoglobulin domain 


1.1e-10


39.5 


504 


abhydrolase


alpha/beta hydrolase fold


0.045


-3.^


505


vwa


von Willebrand factor type A
domain 


7.1e-62 


219.0 


508 


Na_K ATPase 
C 


Na+/K+ ATPase C-terminus


2.3e-145 


496.3 


509 


Exonuclease 


Exonuclease 


1.3e-56


201.5


510


Glycos_transf_1


Glycosyl transferases group 1


2.9e-06


27.0 


511 


Glycos_transf_1


Glycosyl transferases group 1 


2.9e-06 


27.0 


512 


Glycos_transf_1


Glycosyl transferases group 1


1.9e-09 


38.5 


514 


pro_isomerase


Cyclophilin type peptidyl-
prolyl cis-tr


1.8e-63


221.4 


515




EGF-like domain


1.9e-18


74.7


516 


Surp 


Surp module 


4.3e-38 


140.0


523


ig


Immunoglobulin domain


3.3e-06


25.0


526 


UBX 


UBX domain 


1.1e-34


128.6


528


adh_zinc


Zinc-binding dehydrogenases


2.7e-34


127.4


530 


SAM 


SAM domain (Sterile alpha
motif)


0.046 


10.0 


531 


adh_short 


short chain dehydrogenase 


0.0025 


-34.1 


532 


mito_carr


Mitochondrial carrier proteins


2.5e-81


281.7 


533 


mito_carr 


Mitochondrial carrier proteins 


2e-61 


213.5


534 


thiolase 


Thiolase 


3.Se-lS3 


622.0


535 


FMO-like 


Flavin-binding monooxygenase- 
like 


0 


1153.7


536 


SCAN 


SCAN domain 


4e-55 


196.6 


537




tRNA synthetases class I (I, L,
M and V)


3.1e-136


466.0


538


tRNA-synt_1


tRNA synthetases class I (I, L,
M and V)


3.1e-136


466.0


539


tRNA-synt_1


tRNA synthetases class I (I, L,
M and V)


1.9e-117


403.6


540


tRNA-synt_1


tRNA synthetases class I (I, L,
M and V)


3.1e-136


466.0


541 


vATP-synt_E


ATP synthase (E/31 kDa) subunit


5.9e-65


295.7


543 


zf-C2H2 


Zinc finger, C2H2 type


5.5e-69 


242,^ 


544 


DUF101


Protein of unknown function
DUF101


8.5e-38


139.0


545


TGFb_propeptide


TGF-beta propeptide


1.1e-67


238.2 


547 


WD40 


WD domain, G-beta repeat 


2.6e-32 


120.8 


548


RHD


Rel homology domain (RHD).


1.6e-238


686.2


549


MMR_HSR1


GTPase of unknown function


5.4e-67


236.0


551


HECT


HECT-domain (ubiquitin-
transferase).


4.3e-127


435.6


554 


MHC_II_alpha 


Class II histocompatibility 
antigen, alp 


3.5e-74


259.8


555


zf-UBR1


Putative zinc finger in N- 
recognin 


3.3e-16 


67.3 


ODD 


Kelch 


Kelch motif 


5.5&~2S 


109.7 


561 


AMP-binding


AMP-binding enzyme


2.8e-06 


-163.7 


562 


PABP 


Poly-adenylate binding protein,
unique domai


4.9e-38


139.8


564


Gag_p30


Gag P30 core shell protein


1.2e-67


238.2


566 




PWWP domain 


8.1e-16


66.0 


567 


SCAN 


SCAN domain 


7.3e-68 


238.9 


569 


pkinase 


Eukaryotic protein kinase 
domain


1.5e-84


294.3


570


pkinase


Eukaryotic protein kinase
domain


1.5e-84


294.3


571


CN_hydrolase


Carbon-nitrogen hydrolase


n n n n Q 1 


-79.7 


572 


myosin_head


Myosin head (motor domain)


0


1495.2


573


myosin_head




0


1490.4


575 


Surp 


Surp module 


1.7e-23


91.5


576 


Surp 


Surp module 


1.7e-23 


91.5


577


DNA_pol_B


DNA polymerase family B 


0 


1138.6 


578 


PDZ


PDZ domain (Also known as DHR
or GLGF).


8.3e-09


42.7


579


LRR


Leucine Rich Repeat


4.9e-21


83.3


580 


neur_chan 


Neurotransmitter-gated ion- 
channel 


5.9e-177 


601.3 


583 


sushi 


Sushi domain (SCR repeat)


0


1673.0


584 


DEAD 


DEAD/DEAH box helicase


7.3e-36


116.3


586


KH-domain


KH domain


2.9e-13


57.5


587


G-patch


G-patch domain 


2.3e-14 


61.2 


589 


LIM 


LIM domain containing proteins 


2.3e-36 


133.4 


590 


bromodomain 


Bromodomain 


6.6e-32


114.7


591


bromodomain


Bromodomain


6.6e-32


114.7 
592


hormone_rec


Ligand-binding domain of
nuclear hormone


3.5e-22


87.1


593


PHD


PHD-finger


3.8e-12


53.8




cadherin 


Cadherin domain 


4.2e-99


342.7


596


pkinase


Eukaryotic protein kinase
domain


5e-92 


319.2 


597 


WD40 


WD domain, G-beta repeat 


0.00054


26.7


600


FG-GAP


FG-GAP repeat


4.3e-75


262.9


602


G_Adapt_CT


Gamma-adaptin, C-terminus


1.1e-53


191.8


603 


pkinase 


Eukaryotic protein kinase 
domain 


2.3e-86


300.4 


605 


Collagen 


Collagen triple helix repeat 
(20 copies) 


8e-42 


152.4 


606 


mito_carr


Mitochondrial carrier proteins


6.3e-67


232.3


608 


PWWP 


PWWP domain 


2.6e-28 


107.5


609 


PWWP 


PWWP domain 


2.6e-28 


107.5


613 


CAP_GLY 


CAP-Gly domain 


0.0046 


20.1 


615 


RFX_DNA_binding


RFX DNA-binding domain


5.2e-54


192.9 


616 


kinesin 


Kinesin motor domain 


1.1e-81


284.8 


617 


kinesin 


Kinesin motor domain


8.4e-80


278.5 


618 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


0.0098


13.1


620 


MATH 


MATH domain 


7.8e-0S 


22.2 


621 


Y_phosphatase


Protein-tyrosine phosphatase


1.4e-32 


121.6 


622 


pkinase 


Eukaryotic protein kinase 
domain 


4.4e-40


146.6 


623 


BNR


BNR repeat 


2.le-li 


51.3 


624 


molybdopterin


Prokaryotic molybdopterin
oxidoreductas


1.4e-12 


42.2 


625 


TPR 


TPR Domain


1.1e-17


72.2 


627 


cNMP_binding 


Cyclic nucleotide-binding
domain 


3 .7e-5e 


206.6 


630 


adh_short 


short chain dehydrogenase 


5e-17


70.0


631


zf-C2H2


Zinc finger, C2H2 type 


2.1e-88 


307.1 


632 


rrm 


RNA recognition motif. 


4e-05 


30.5 


635 


pkinase 


Eukaryotic protein kinase 
domain 


1.6e-104


360.7 


636 


Fork_head


Fork head domain 


5.9e-27 


103.0 


637 


pkinase 


Eukaryotic protein kinase 
domain 


3.8e-70


246.5


642 


TPR 


TPR Domain 


4.8e-08


40.1


643


efhand


EF hand 


1.9e-27 


104.6


647




SNF2 and others N-terminal
domain


1.2e-10a


351.1


648


h 2 


RNA pseudouridylate synthase 


1.9e-55 


197.6


650


zf-C2H2


Zinc finger, C2H2 type


w . UUO / 


22.7


651 


ank 




± . je— X / 


71.9


652


I_LWEQ




9.5e-101


341.0 


653 


neur_chan


Neurotransmitter-gated ion-
channel


4.1e-171


581.8


654


tsp_1


Thrombospondin type 1 domain


4.1e-47 


169.9 


659 


FH2 


Formin Homology 2 Domain 


1e-107


371.2 


661 


pou 


POU domain - N-terminal to
homeobox domain


5.3e-45


162.9 


662 


C2 


C2 domain 


6.7e-19


76.2 


663 


C2 


C2 domain 


6.7e-19 


76.2


664 


C2 


C2 domain 


6.7e-19 


76.2 


6^7 


GST 


Glutathione S-transferases


9.3e-34


114.4


668 


LRR 


Leucine Rich Repeat 


9.3e-3a 


115.6 


670 


spectrin 


Spectrin repeat 


4e-57 


203.2 


671 


I_LWEQ 


I/LWEQ domain


9.5e-101 


341.0 


672 


ABC_tran


ABC transporter


5.3e-60


212.8 


674 


WD40 


WD domain, G-beta repeat 


4.8e-24 


93.3 
675




WD domain, G-beta repeat


4.8e-24


93.3 


676 




Leucine Rich Repeat 


o.oois 


25.2 


679 




Zinc finger C-x8-C-x5-C-x3-H
type 


2.6e-29 


107.7 


680 




Zinc finger, C2H2 type 


5.2e-05 


30.1 


681 


CH


Calponin homology (CH) domain


2.4e-17


71.1 


682 




Dual specificity phosphatase,
catalytic doma


4.3e-43


156.6


683 




Zinc finger, C3HC4 type (RING 
finger) 


0.051


10.8 


687 




Synapsin


0 


1B90.8 


689 


PR55


Protein phosphatase 2A
regulatory subunit PR


0 


1038.8 


691 


homeobox 


Homeobox domain 


8.5e-30 


112.4


696


Peptidase_M24


metallopeptidase family M24


2.6e-59


210.5


697 




RhoGEF domain 


9.5e-35 


128.9 


698 


PHD 


PHD-finger


0.008 


9.3 


701 


zf-C2H2


Zinc finger, C2H2 type


5.5e-123


422.0


702 


Sulfatase


Sulfatase


3e-231


781.6


703


zf-C2H2


Zinc finger, C2H2 type 


5.7e-20 


79.8 


707 


Acyl_transf


Acyl transferase domain.


1.1e-22


88.6


708


WD40


WD domain, G-beta repeat


4.8e-19


76.7


710


Ran_BP1


RanBP1 domain.


8.4e-06 


-7.3 


713 


DEAD 


DEAD/DEAH box helicase


9.9e-42


134.9 


714 


PH 


PH domain 


1.6e-09 


39.0 


715 


DSPc 


Dual specificity phosphatase, 
catalytic doma 


1.5e-37 


138.2 


717 


Sialyltransf


Sialyltransferase family


7.5e-31 


115.9 


718 




Immunoglobulin domain 


1e-29


100.8


719


integrin_B


Integrins, beta chain 


0 


1125.4 


720 


zf-C3HC4


Zinc finger, C3HC4 type (RING
finger)


1.1e-08


32.4 


722 


Peptidase_C2


Calpain family cysteine
protease


3e-145


495.9


723


ig


Immunoglobulin domain


2.2e-0S 


22.4 


724 


F-box


F-box domain. 


0.007 


23.0 


725 


Nop


Putative snoRNA binding domain


8.1e-58


205.5


726


Nop


Putative snoRNA binding domain


8.1e-58


265.5 


727 


WD40 


WD domain, G-beta repeat


7.5e-26


99.3 


730 




Double-stranded RNA binding
motif 


0.027 


12.1 


731 




Dynamin family 


4.2e-16


66.9 


733 


zf-CCCH


Zinc finger C-x8-C-x5-C-x3-H
type


2.8e-10


41.7


735


CDP-OH_P_transf


CDP-alcohol
phosphatidyltransferase


4.2e-26


100.1 


738 


DEAD 


DEAD/DEAH box helicase


8.6e-57


182.5 


739 


TSC22 




6.5e-32


119.5


742 


ras 


Ras family


2.2e-100


346.9 


743 
747 


PMI_typeI
trypsin


Phosphomannose isomerase type I


1.2e-243


822.9


748


Kazal 


Kazal-type serine protease 
inhibitor domain 


6.4e-88
2.2e-52 


279.4 
187.4


749 
751 


PHD 


EF hand
PHD-finger


6.3e-06
4.9e-16


33.1 
66.7 


752 
753 


zf-C2H2
Hydrolase 


Zinc finger, C2H2 type 
haloacid dehalogenase-like 
hydrolase 


3.2e-21 
6.1e-11


83.9
49.8 


754 


Ribosomal_L3
9


Ribosomal L39 protein


0.00018


26.7 


755 


PH 


PH domain 


3.6e-14 


55.7


758 


SCAN 


SCAN domain 


1.4e-53 


191.5 


759 


PA 


PA domain 


0.0065 


23.1 


760 


arf 


ADP-ribosylation factor family


2.2e-19


77.8 


761


CIDE-N


CIDE-N domain


2.2e-40


147.6 



SEQ ID
NO:


PFAM NAME 


DESCRIPTION 


p-value


PFAM 
SCORE 


762 


histone


Core histone H2A/H2B/H3/H4


9.9e-53


188.6 


763 


zf-MYND


MYND finger 


4.1e-14 


60.3 


764 


pou 


Pou domain - N-terminal to
homeobox domain


1e-52


188.6


767 


vwc 


von Willebrand factor type C 
domain 


2.9e-34


127.3 


769 


efhand


EF hand 


4.8e-11


50.1


770 


zf-C4 


Zinc finger, C4 type (two 
domains)


2.4e-53


181.6


772 


ras 


Ras family


7e-90 


312.0 


773 


Sulfatase 


Sulfatase 


1e-142


487.5 


775 


zf-C2H2


Zinc finger, C2H2 type


1.1e-12


55.5 


776 


zf-C2H2 


Zinc finger, C2H2 type 


1.1e-12


55.5


777 


zf-C2H2 


Zinc finger, C2H2 type 


1.1e-12


55.5 


778 


rrm


RNA recognition motif.


2.1e-32


121.1


779 


G6PD 


Glucose-6-phosphate 
dehydrogenase 


1.5e-76 


236.6 


780


spectrin 


Spectrin repeat 


3.7e-29


110.3 


781 


mito_carr


Mitochondrial carrier proteins


4.6e-57


198.5 


782 


SCAN 


SCAN domain 


1.3e-24 


95.2 


783 


PDZ 


PDZ domain (Also known as DHR 
or GLGF)


4.1e-07


37.1 


785 


DEAD 


DEAD/DEAH box helicase


6e-06


21.7 


786 




Ras family 


5.3e-39


143.0 


787 


RNase_HII


Ribonuclease HII


2.5e-67


237.1 


790 


PI3_PI4_kina
se


Phosphatidylinositol 3- and 4-
kinases


5.4e-108


372.2 


795 


cadherin 


Cadherin domain 


2.5e-40


147.4 


796 


ARID 


ARID DNA binding domain


1.6e-20 


81.6 


797 


trypsin 


Trypsin 


9.9e-20


64.8 


799 


CH 


Calponin homology (CH) domain 


3.7e-15


63.8 


801 


Gal-

bind_lectin


Vertebrate galactoside-binding
lectin


4.1e-25


88.7 


803 


WD40 


WD domain, G-beta repeat 


0.00082 


26.1 


806 


TBC 


TBC domain 


1.8e-26 


101.4 


807 


TBC 


TBC domain 


1.8e-26


101.4 


808 


CN_hydrolase 


Carbon-nitrogen hydrolase 


8.8e-80


"■^^6.5 


811 


F " 


Histone-like transcription
factor 


€e-14 


59.8 


812 


adh_short 


short chain dehydrogenase 


8.1e-20 


79.3 


814 


IMP4 


Domain of unknown function 


3.3e-71


250.0 


815 


zf-C2H2 


Zinc finger, C2H2 type 


8.2e-66 


232.1 


816


Pept_tRNA_hy 
dro 


Peptidyl-tRNA hydrolase


1.6e-37 


138.0 


817 


ARID 


ARID DNA binding domain 


2.5e-18 


74.3 


826 


IF5_eIF4_eIF
2


eIF4-gamma/eIF5/eIF2-epsilon


1.6e-32 


121.5 


830 


ArfGap 


Putative GTP-ase activating 
protein for Arf 


1.5e-53


191.3 


831 




Leucine Rich Repeat 


2.1e-26


101.1 


832 


laminin_EGF


Laminin EGF-like (Domains III
and V)


2e-57


204.2


839 




RNA recognition motif.


1.3e-22


88.5


840 


Y_phosphatas
e


Protein-tyrosine phosphatase


2.6e-119


409.8


841 


pkinase


Eukaryotic protein kinase 
domain 


3.4e-100 


346.3 


844 


Ribosomal L2 
2e 


Ribosomal L22e protein family 


le-^4 


228.4


846 


IBR 


IBR domain 


9e-15 


62.5 


849 


zf-C3HC4


Zinc finger, C3HC4 type (RING
finger)


7.4e-07


26.5 


850 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


0.00016 


18.9 


851


SET 


SET domain 


5e-30 


113.2


852 


SRCR 


Scavenger receptor cysteine-


0 


1025.4
SEQ ID
NO:


PFAM NAME


DESCRIPTION


p-value 


PFAM 
SCORE 












853


SRCR 


Scavenger receptor cysteine- 


0 


1025.4


857 


lactamase_B


Metallo-beta-lactamase
superfamily


0.012


-6.0


858 


COX6A


Cytochrome c oxidase subunit
VIa


3.4e-58


206.7


859 


rrm




5.4e-45


162.9 


861


PRK 


Phosphoribulokinase 


b\le-62 


219.4 


863




Mitochondrial carrier proteins 


:?.9e-53 


185.5 


864


HSP90


Hsp90 protein


4.7e-158


538.5


866 


let 

-•■3 


Immunoglobulin domain 


4e-12 


44.1 


867 


zf-C2H2


Zinc finger, C2H2 type


7e-135


461.5 


872 


histone


Core histone H2A/H2B/H3/H4


4.9e-41


149.8


874 


CPSase_L_cha


Carbamoyl-phosphate synthase
(CPSase)


2.1e-218


739.0 




Ribosomal_S1
2e 


Ribosomal protein S12e 


2.1e-98 


340.3 




serpin


Serpins (serine protease 
inhibitors) 


2.5e-42 


145.7


883 


Patatin


Patatin


1.2e-51


182.0 


884 


RA


Ras association (RalGDS/AF-6)
domain 


0.044 


8.0 


887


DUF92


Integral membrane protein DUF92


2.7e-12


54.3 


889 




Sugar (and other) transporter


8.2e-63


222.1 


893 




Domain of unknown function 
DUF26 


1.3e-43 


158.3 


896 


IP_trans


Phosphatidylinositol transfer
protein


6.5e-98


338.7 


898 


DEAD 




DEAD/DEAH box helicase


1.5e-48


15^. 5 


899 




KE2 family protein


7e-61


215.7


900 


KE2 


KE2 family protein 


4.3e-51


183.2 


901 


zf-C2H2 


Zinc finger, C2H2 type 


2.7e-57 


203.8 


902 


ras 


Ras family 


2.3e-75 


263.8


904 




TPR Domain 


3.2e-22 


87.2 


906 


GBP 


Guanylate-binding protein


8.9e-253


853.1


907 


GBP 


Guanylate-binding protein


1.1e-239


809.6 


908


WD40


WD domain, G-beta repeat 


2.6e-26 


100.8 


909 


PH


PH domain


1.3e-09


39.4 


910 


zf-C2H2


Zinc finger, C2H2 type 


2.5e-39 


144.1 


913 


En im^v*A fitf^ 

J» Utt^JL O \3 


NAD dependent
epimerase/dehydratase family


5e-07 


-88.5 


921 


TBC 


TBC domain


1.5e-09


30.7 


922 


WD40 


WD domain, G-beta repeat


1.6e-25


98.2 


923 


WD40 


WD domain, G-beta repeat 


8.2e-07 


36.1 


924 


Hydrolase


haloacid dehalogenase-like
hydrolase


2.9e-05 


29.1 


925 


UQ_con




0.00033


-27.6 


925 


CH 


Calponin homology (CH) domain


3.3e-53 


190.2 


928 


WD40 


WD domain, G-beta repeat


5.9e-48 


172.7 


929 


zf-C3HC4 


Zinc finger, C3HC4 type (RING
finger)


3.1e-10


37.4


930


Ribul_P_3_ep 
im 


Ribulose-phosphate 3 epimerase
family


7.2e-105


361.8


931 


Ribul_P_3_ep
im 


Ribulose-phosphate 3 epimerase 
family 


1.2e-96 


334.4 


936 


C2


C2 domain 


2.2e-62 


220.7 


937 


NAP_family


Nucleosome assembly protein
(NAP) 


l.le-ii - 


84.6 


940 


abhydrolase


alpha/beta hydrolase fold


0.011


3.1 


944 


Tropomyosin 


Tropomyosins 


3.2e-07


25.1 


948 


pkinase 


Eukaryotic protein kinase
domain


3.4e-75


263.2 


949 


WD40 


WD domain, G-beta repeat 


a.8e-27 


104.7 


950 


Acyltransfer


Acyltransferase


1.6e-07 


38.4
SEQ ID
NO:


PFAM NAME


DESCRIPTION


p-value


PFAM 
SCORE 


951 


SAM 


SAM domain (Sterile alpha 
motif)


0.014 




954


GFO_IDH_MocA


Oxidoreductase family




52.0


955 
956 


BTB 
BTB 


BTB/POZ domain 

BTB/POZ domain


7e-22 


86.1 


957 


CDP- 


CDP-alcohol

phosphatidyltransferase


7e-22
0.053


86.1

-22.2 


959 


ras 


Ras family


2.4e-97


336.8 


960 


ras 


Ras family 


8.4e-43 


155.6 


961 


Acetyltransf


Acetyltransferase (GNAT) family


1.2e-08


42.2


962 


adh_short 


short chain dehydrogenase 


2.4e-31


117.0


963 


mutT 


Bacterial mutT protein


5.6e-06


26.2


969


IF-2B 


Initiation factor 2 subunit 
family 


8.4e-193


653.9


970 


RNase_PH


3' exoribonuclease family


9e-24


92.4


975 


WW 


WW domain 


5.7e-25


96.4 


977 


PDZ


PDZ domain (Also known as DHR
or GLGF)


3.6e-21


83.7


978 


Ribosomal_L1
7


Ribosomal protein L17


2.4e-20


ei,o 


979 




LIM domain containing proteins


5.8e-42


152.8


980 


Calsequestri
n


Calsequestrin


1.7e-297 


1001.7 


982 


HSP20 


Hsp20/alpha crystallin family


1.2e-10


4S,i ~ 


983 
988 


oxidored_q6
TBC


NADH ubiquinone oxidoreductase,
20 Kd sub
TBC domain


4.8e-63


222.9


989 


TBC 




2.2e-50 
2.2e-50


180.8 
180.8 


993 


tRNA_int_end
o


tRNA intron endonuclease


0.0017


-34.2 


994 


homeobox 


Homeobox domain


4e-18


73.6


997 
1000 


pyr_redox
mito_carr


Pyridine nucleotide-disulphide
oxidoreducta

Mitochondrial carrier proteins


0.012


11.6 


1001 


RA


Ras association (RalGDS/AF-6) 
domain 


3 . /e-i^3 

J. . ^e-XD 


421.2
o5 ,4 


1004 
1005 


DUF81
actin 


Domain of unknown function 

DUF81 

Actin 


0.099
1.3e-174


574.3


1006 
1007 


actin 
cpn60_TCPl 


Actin

TCP-1/cpn60 chaperonin family


3.1e-130
3.7e-195


428.6
661.8


1008 
1009 


TPR

zf-C2H2


TPR Domain

Zinc finger, C2H2 type


8.1e-44


159.0


1011 


zf-C2H2


Zinc finger, C2H2 type


3.6e-61
3.6e-61 


216.6 
216.6 


1012 


zf-C3HC4


Zinc finger, C3HC4 type (RING
finger) 


4.7e-l£ 


53.1 


1016 


tRNA-synt_2c


tRNA synthetases class II (A)


2.3e-15


55.2 


1016 


RhoGAP


RhoGAP domain 


1.6e-78 


274.3 


1022 


PGAM 


Phosphoglycerate mutase family


3.8e-18


69.7


1026 


HMG_box


HMG (high mobility group) box


8.4e-20 


79.2 


1027 


TBC 


TBC domain 


7.3e-45 


162.5 


1028 


UQ_con

PDZ


Ubiquitin-conjugating enzyme


1.4e-49


178.1


1032 




PDZ domain (Also known as DHR
or GLGF)


0.028


Tr.3 


1034 
1037 


Hydrolase 
KRAB 


haloacid dehalogenase-like

hydrolase 

KRAB box 


2e-21 


64.6


1038 


Cation_efflu

x


Cation efflux family


4. Be- 06 
V.ie-42 


32.4 
152.5


1040 

1042 
1043 


zf-C2H2


NAD:arginine ADP-

ribosyltransferase

WD domain, G-beta repeat


4.7e-47 
1.9e-18 


169.1 
74.7


1045 

1046


lectin_c
glucosamine ( 

LEO 1 


Zinc finger, C2H2 type
Lectin C-type domain
Glucosamine-6-phosphate
isomerase


3.7e-24 
1.3e-28 
0.00013 


93.7

108.0 

-25.1
SEQ ID
NO:
1047
1049


PFAM NAME
ligase-CoA


DESCRIPTION

CoA-ligases


p-value 
4.5e-50


PFAM 

SCORE

179.4


1050 

1054 
1055 


Ribosomal_L2
4e

rrm


Immunoglobulin domain
Ribosomal protein L24e

Amidase 

RNA recognition motif. 


1.7e-09
2e-33

4.3e-152


35.6 
124.5 

518.7 


1058 

1059

1060 


annexin 

PMP22_Claudi

n


Annexin

PMP-22/EMP/MP20/Claudin family
Homeobox domain 


3.8e-26
6.9e-44
0.023 


100.3
159.2
*23.6 


1062 

1064 
1065
1066 
1071 
1072 
1074 


Acyltransfer
LRR

GTP1_OBG

""Tod 

DENN 


Acyltransferase

AMP-binding enzyme
Leucine Rich Repeat

GTP1/OBG family

Immunoglobulin domain

PHD-finger

DENN (AEX-3) domain


3.2e-31 

0.00065

6.6e-100
3.3e-14
4.8e-41
8.4e-48
6.8e-07


117.2
10.5

345.3
60.6
141.8
159.1
36.3 


1075
1077
1078
1079
1087
1093 

1094 


SCP 
OLF

mito_carr
WD40
START
DSPc


SCP-like extracellular protein
Olfactomedin-like domain
Mitochondrial carrier proteins
WD domain, G-beta repeat
START domain

Dual specificity phosphatase,
catalytic doma 
Glutathione peroxidases 


8.3e-33 

4 . Ve-4a 

2.2e'6^ 

le-42 

6.2e-45 

1.5e-48

3.3e-63


121.5 
149.8
234.0
149.3

* "i RO "n 
xo^ • / 

174.7 

223.4


1095 
1096 


DUF25 


Domain of unknown function
DUF25 


9.6e-41 
2e-75 


148.8 
264.0 


1105 


DUF25


Domain of unknown function
DUF25


6e-75 


262.4 


1106 


Nitroreducta

se

PTE


Nitroreductase family

Phosphotriesterase family


1.3e-13


58.6


1107 
1109 


DAGKc 
ras


Diacylglycerol kinase catalytic
domain

Ras family


1.3e-179
0.00049


^10.1 
*19.6 


1115 
1116 


ArfGap
HMG14_17


Putative GTP-ase activating

protein for Arf
HMG14 and HMG17


1.3e-15
a .7e-47 

4.4e-21


[40.7 
168.7 

83.5 


1117 
1119 


HMG14_17

FAA_hydrolas

e


HMG14 and HMG17 

Fumarylacetoacetate (faa) 
hydrolase fam 


9.9e-12
2e-83


52.4
290.6 


1120 
1123 


abhydrolase 


Eukaryotic protein kinase
domain

alpha/beta hydrolase fold


1.4e-94


327.6 


1129 

1131 
1132 
1133 


pro_isomeras
e

DnaJ

WD40


Cyclophilin type peptidyl-
prolyl cis-tr
DnaJ domain

WD domain, G-beta repeat


9.2e-23
2.2e-56

1.6e-30
1.3e-19


89.0
197.1

114.9
78.6 


1134 
1136 

1137 


WD40 

Adap_comp_su
b


WD domain, G-beta repeat 
PH domain 

Adaptor complexes medium 
subunit family 


1.8e-15

0.0015

1.2e-25ff 


64.9 
17.8
866.0 


1139


t> 

ras 


Adaptor complexes medium
subunit family
Ras family


2.5e-209


708.8


1141
1152


1 


Eukaryotic protein kinase 
domain 


1.5e-86 
9.4e-74


301.0 
258.4 


1153
1155


Acyltransfer
IRS


Acyltransferase

PTB domain (IRS-1 type)


1.2e-05
5.4e-55


29.9 
196.1 


1157

1159
1160


^g J 

Asparaginase / 
2 

JMC^oxred c 
jt-ANl '^p 


Immunoglobulin domain
Asparaginase

GMC oxidoreductases

AN1-like Zinc finger


1.3e-31
5.4e-72

1.7e-142
0.00021


106.9
252.3

IS5.3 
J7.9
SEQ ID
NO:


PFAM NAME


DESCRIPTION


p-value


PFAM 
SCORE 


1163


linker_histo
ne


linker histone H1 and H5 family


3.8e-14


60.4


1164 


DED 


Death effector domain


3.9e-05


30.5


1165 


IRS 


PTB domain (IRS-1 type) 


2.6e-43


157.3


1166 


IRS 


PTB domain (IRS-1 type)


2.6e-43


157.3


1168 


SAM 


SAM domain (Sterile alpha 
motif) 


0.04 


10.5


1170 


abhydrolase 


alpha/beta hydrolase fold. 


0.096 


-7.5


1174 


SAP 


SAP domain 


3.9e-10


47.1


1177 


PP2C 


Protein phosphatase 2C


5.3e-31


112.5


1178 


WD40


WD domain, G-beta repeat 


4.7e-35 


129.9 


1180 


Ets




1.8e-09


33.3


1181 


Collagen 


Collagen triple helix repeat 


0.00016 


24.7


1182 


TCL1_MTCP1


TCL1/MTCP1 family


9.5e-56


198.6


1184 




RasGEF domain


1.7e-88 


307.4 


1185 


mito_carr


Mitochondrial carrier proteins 


1.5e-62 


217.3 


1187 


UPAR_LY6


u-PAR/Ly-6 domain


0.0042


15.6


1188 


orn^OAP__Arg^ 


Pyridoxal-dependent
decarboxylase 


6.2e-128


430.6 


1193 




Stathmin family 


1.8e-90 


314.0 


1194 


Stathmin 


Stathmin family 


1.8e-90 


314.0 


1195


Sec1


Sec1 family


3.2e-183


622.1


1196 


pyr_redox




Pyridine nucleotide-disulphide
oxidoreducta 


3.1e-32 


111.6 


1197 


a 

o 


Glycosyl transferase family 8


1.2e-09


45.5 


1202 




K+ channel tetramerisation
domain


0.022


-16.8 


1203 


adh_short


short chain dehydrogenase


8.3e-45


162.3


1206 


nietJiylti 


vLox&/^uy& metnyxtransr erase 


1.3e-121 


417.4 


1208 


7tm_3




7.2e-09


29.0


1209 


ank 


Ank repeat


3.9e-15


63.7


1210 


vATP-
synt_AC39


ATP synthase (C/AC39) subunit


2.5e-125


439.7 


1212 


zf-C2H2


Zinc finger, C2H2 type


5.5e-17 


69.9 


1213 






3.2e-07


37.4 


1219 


rrm 


RNA recognition motif. 


2.1e-40 


147.7 


1220 


DUF6 


Integral membrane protein DUF6


0.015


21.5 


1222 




SCAN domain


1.5e-71


251.1 


1223 


G-gamma


GGL domain


3.6e-36


129.5 


1227 


catalase 


Catalase 


0 


1158.9


1232 


PX 


PX domain 


2.2e-15


64.5 


1233 





PX domain 


2.2e-15


64.5


1236 


FCH 


Fes/CIP4 homology domain 


3.3e-09 


44.0 


1241 


Peptidase_M2
0 




2e-63 


224.1 


1243 


WW 


WW domain


0.044


17.9 


1247 


UPFOOO^ 


Metalloenzyme of unknown 
function TJPPOOOC 


6.3e-61


215.8 


1248 


Glycos trans 


Glycosyl transferases


4.5e-10




46.9 


1249 


efhand


EF hand


4e-11


50.4 


1254 


UQ_con


Ubiquitin-conjugating enzyme


2.1e-73 


257.3 


1255


ras 


Ras family 


2.2e-62 


220.7 


1256 


formyl_trans
f


Formyl transferase


4.9e-30


108.3 


1259 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


5.3e-13


46.4 


1261 


DiHfolate_re
d 


Dihydrofolate reductase 


2.1e-69 


241.7 


1262 


G_glu_transp
ept


Gamma-glutamyltranspeptidase


1.8e-110


380.4 


1263 


PAS 


PAS domain 


1.3e-08


36.9 


1265 




Leucine Rich Repeat 


4.2e-22


86.9
SEQ ID
NO:


PFAM NAME


DESCRIPTION


p-value


PFAM 
SCORE 


1266


SCP


SCP-like extracellular protein


6e-29


108.0


1267 


K_tetra 


K+ channel tetramerisation
domain


2.8e-27


104.0


1269 


ras 


Ras family 


1.3e-85


297.9


1275 


zf-C3HC4


Zinc finger, C3HC4 type (RING
finger)


4.2e-10


37.0


1276 


abhydrolase


alpha/beta hydrolase fold


5.4e-23


89.8 


1277


abhydrolase


alpha/beta hydrolase fold




83.1


1279 


trypsin 


Trypsin 




132.0


1280 


PBP 


Phosphatidylethanolamine-
binding protein


1.3e-13


58.7


1285


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


5.6e-14 


49.6 


1287 


ank 


Ank repeat


1.7e-52


187.8


"129? 


fn3


Fibronectin type III domain


0.026


20.9 


1295 


GBP


Guanylate-binding protein


0.00026


-70.0 


1296 


PMP22_Claudi
n


PMP-22/EMP/MP20/Claudin family


6.9e-41 


149.3 


1297


Rhodanese


Rhodanese-like domain


3.2e-14


60.7


1298 


LIM


LIM domain containing proteins 


5.8e-21 


79.1 


1301 


rnaseA 


Pancreatic ribonucleases 


4.9e-43 


145.2 


1307


mito_carr


Mitochondrial carrier proteins


2.1e-53


186.0 


1308 


WD40 


WD domain, G-beta repeat


1.6e-17 


71.6 


1310 


UPAR_LY6


u-PAR/Ly-6 domain


7.1e-20


75.5 


1313 


thiored 


Thioredoxin 


3.6e-05


21.6 


1314 


Aa_trans 


Transmembrane amino acid 
transporter protein 


1.5e-67


237.9


1316


trypsin 


Trypsin 


4.4e-41 


132.0 


1320 


Ribosomal_L1
3 


Ribosomal protein L13 


3.9e-6a 


219.8 


1327 


Armadillo_se
g


Armadillo/beta-catenin-like
repeats


0.0054


23.4


1328 


KRAB 


KRAB box 


0.052 


-5.6 


1329 




RNA recognition motif. 


2.1e-40 


147.7


1330 




Apoptosis regulator proteins, 
Bcl-2 family


0.014


-1.6 


1331 


PX 


PX domain


2.1e-10


48.0 


1333 


KRAB




1.8e-36


134.6


1334 


UPP_syntheta


Putative undecaprenyl


2.3e-83 


310.3 


1335 


UPP_syntheta
se


diphosphate synt


1.8e-59


211.0


1336 


DSPc


Dual specificity phosphatase,
catalytic doma


1.2e-31




1337 


DSPc 


Dual specificity phosphatase,
catalytic doma


2.3e-12


54.5


1338 


TPR 


TPR Domain


0.00021


28.1 


1340 


metalthio


Metallothionein


0.013


20.3


1341 


mutT 


Bacterial mutT protein 


5.8e-09


36.5


1343 


Band_41


FERM domain (Band 4.1 family)


1.3e-38


122.5


1344 


Kelch 


Kelch motif 


1.4e-44


161.5


1345 


Antifreeze 


Antifreeze protein 


1.2e-10 


48.8


1347 


3Beta_HSD 


3-beta hydroxysteroid
dehydrogenase/isomera


0.086 


-177.2 


1348 


BTB 


BTB/POZ domain


5.3e-28 


106.5 


1349 


DUF6 


Integral membrane protein DUF6 


0.033


15.8 


1350 


myosin_head


Myosin head (motor domain) 


0 


1088.7 


1352


Nramp


Natural resistance-associated
macrophage pro 


1.2e-202 


686.6 


1353 


S_100 


S-100/ICaBP type calcium
binding domain


5.3e-23


89.9


1355 


DEAD 


DEAD/DEAH box helicase 


3.6e-65


209.0 


1356 




C2 domain 


2.4e-15 


64.4 


1357 


RBD 


Raf-like Ras-binding domain


4.2e-57


203.1


1360 


zf-C2H2


Zinc finger, C2H2 type


7.4e-141


481.4 


1361 


HMG14_17


HMG14 and HMG17 


7.9e-40 


145.7
SEQ ID
NO:
1362
1363


PFAM NAME

SIS
SIS


DESCRIPTION
SIS domain

SIS domain


p-value

3.8e-30


PFAM
SCORE
113.6


1364 
1368 


K_tetra 


Immunoglobulin domain

K+ channel tetramerisation
domain


1.3e-28
0.00026
1.1e-16


108.5 

19.0 

68.9 


1371

1372 
1376 
1378 
1380 


Collagen

DnaJ
KRAB
ELM2
thiored


Collagen triple helix repeat
(20 copies)
DnaJ domain
KRAB box
ELM2 domain
Thioredoxin


^.2e-il3 

6.6e-36 
2.1e-38
2e-23
1.2e-23


390.1 

132.7 
141.0
91.3
82.8


1381 
1382 
1383 
1384 
1387 

1389 
1390 
1393 
1394 
1398 


ank 

BTB 

WD40 

WD40 

zf-C3HC4 

zf-C2H2

zf-C2H2

kinesin
zf-C2H2
KRAB


Ank repeat 

BTB/POZ domain 

WD domain, G-beta repeat 

WD domain, G-beta repeat 

Zinc finger, C3HC4 type (RING 

finger) 

Zinc finger, C2H2 type

Zinc finger, C2H2 type

Kinesin motor domain

Zinc finger, C2H2 type

KRAB box


2.3e-83

3e-11

1.6e-19

6.3e-24

1.1e-09

5.5e-50
2.5e-85
7.8e-188
1.2e-49
5.1e-22 


290.4 

50.8

78.3 

92.9 

35.6 

179.5
296.9 
637.4 
178.4 


1402 
1405 
1406 
1407 
1408
1409 

1410 


bZIP
sugar_tr
RhoGAP
rrm

LRR

Nebulin_repe
at

ank


bZIP transcription factor

Sugar (and other) transporter

RhoGAP domain

RNA recognition motif.

Leucine Rich Repeat

Nebulin repeat

Ank repeat


0.035

0.003

8.9e-47

1e-36

2.1e-13

6e-54


13.1 

-101.5 

168.6 

132.1 

58.0 

192.6 


1412 

1415
1416 
1417 
1419 


Ribosomal_L5
_C

trypsin
aminotran_1

S1

WD40


ribosomal L5P family C-terminus
Trypsin

Aminotransferases class-I
S1 RNA binding domain
WD domain, G-beta repeat


1.6e-17
a.2e-58 

4.7e-85 
4.4e-05
1.6e-07
2.2e-09


71.6
205.5

270.4
-91.2
33.1
44.6




1422 
1424 
1425 
1426 
1427 


cadherin 

SH3 

PHD 
PHD 

ArfGap 


Cadherin domain
SH3 domain
PHD-finger
PHD-finger

Putative GTP-ase activating
protein for Arf


8.3e-42
2.5e-80
3.2e-17
3.2e-17
1e-37


152.3
280.3
70.6
70.6
138.8




1428 

1429 
1430 
1431


helicase_C
WD40

inositol_P
mito_carr


Helicases conserved C-terminal
domain

WD domain, G-beta repeat
Inositol monophosphatase family
Mitochondrial carrier proteins


1e-26

3.9e-07
2.5e-10


102.2 

37.2 
40.2 




1433 
1434 
1435 

143^ 
1438 
1440 
1441 
1443 
1446 
1447 

1448
1451

1454

1455
1460
1461
1470
1472


C1q

WD40 

Inos-1- 

P_synth 

rrm 

ig 

G_Adapt_CT
G_Adapt_CT
Kelch

ARID

zf-C2H2

AMP-binding
rrm
ig

Sialyltransf
Aldose_epim
C2
TIG
PseudoU_synt


C1q domain

WD domain, G-beta repeat

Myo-inositol-1-phosphate
synthase

RNA recognition motif.
Immunoglobulin domain
Gamma-adaptin, C-terminus
Gamma-adaptin, C-terminus
Kelch motif
ARID DNA binding domain
Zinc finger, C2H2 type
AMP-binding enzyme

RNA recognition motif.
Immunoglobulin domain
Sialyltransferase family
Aldose 1-epimerase
C2 domain
IPT/TIG domain

tRNA pseudouridylate synthase


4.3e-83
2.9e-16
1.6e-13
7e-228

1.4e-34

1.3e-12

3.4e-67

3.4e-67

0.00013

1.8e-21

9.4e-28 

2.c;e-07 ■■- 

6.5e-21

5.6e-44 

5.4e-21 

1.9e-35

ae-18 

3.1e-19 

1.3e-16


287.7
66.2 
58.3 
770.4 

128.3 

45.6 

236.7 

236.7 

28.7 

84.7

105.6 

-14^.1 

82.9 

146.7 

93.2

131.2 

73.6 

77.3 

56.9
SEQ ID
NO:


PFAM NAME


DESCRIPTION


p-value


PFAM 
SCORE 












1474 


DENN 


DENN (AEX-3) domain 


1.3e-44 


161.6 


1475


Cation_efflu

x


Cation efflux family 


4.6e-4^ 


176.4 


1477 


TBC 


TBC domain 


8e-47


169.0


1478 


rrm 


RNA recognition motif. 


2e-21 


84.6


1480 




Immunoglobulin domain 


5.5e-06 


24.3 


1484 


Telo_bind_al
pha


Telomere-binding protein alpha
subuni


0.028 


-225.9 


1485 


zf-C2H2


Zinc finger, C2H2 type 


l.&e-68 


240.9


1486 


pkinase 


Eukaryotic protein kinase
domain


9.5e-13


49.9 


1488 


helicase_C 


Helicases conserved C-terminal
domain


1.4e-15


65.2


1489


DUF89 


Protein of unknown function 
DUF89 


0.079 


-132.4 


1490 


ECH 


Enoyl-CoA hydratase/isomerase 
family 


5.2e-41 


149.7


1491 


guanylate_cy
c


Adenylate and Guanylate cyclase
catalyt


5.9e-46


166.1


1492


LRR


Leucine Rich Repeat 


3.4e-19


77.2


1495


zf-C3HC4


Zinc finger, C3HC4 type (RING
finger) 


7.1e-10 


36.3 


1497 


pkinase 


Eukaryotic protein kinase
domain


1e-22


85.8 


1500 


SH3 


SH3 domain 


9.3e-05


27.2


1502 


homeobox


Homeobox domain 


0.084 


13.8


1503 


homeobox


Homeobox domain


0.084


13.8 


1505 


EGF 


EGF-like domain


2.7e-23 


90.8 


1506 


UCH-2 


Ubiquitin carboxyl-terminal
hydrolase family 


2.7e-21


84.2 


1508


Peptidase_M2
0


Peptidase family M20/M25/M40


2 . ee-2e 


101.8 


1511


PX 


PX domain 


1.9e-11


51.5 


1512 


Sulfatase 


Sulfatase 


2.8e-35


130.7


1516 


syntaxin 


Syntaxin 


0.011 


-62.3


1518 


aminotran_3


Aminotransferases class-III
pyridoxal-pho


9.7e-106 


305.6 


1520 


ig 


Immunoglobulin domain 


0.075 


11.0 


1521 




Ras association (RalGDS/AF-6)
domain 


0.013


13.1


1523 


RhoGAP 


RhoGAP domain 


2.5e-05


18.7 


1528 


WD40 


WD domain, G-beta repeat


5.4e-24 


93.1 


1535 


IMS 


impB/mucB/samB family 


7.8e-95


328.5 


1538 


FYVE


FYVE zinc finger


3.2e-27 


101.5 


1539 


DAGKc 


Diacylglycerol kinase catalytic 
domain 


6e-07 


36.5 




Ocular_alb


Ocular albinism type 1 protein 


0 


1184.7 




SAP


SAP domain 


6e-06 


33.2 


1654 


Amino_oxidas
e


Flavin containing amine oxidase 


3.2e-43 


157.0 




Amino_oxidas
e 


Flavin containing amine oxidase 


3.2e-43 


157.0 


1656 


RhoGEF 


RhoGEF domain 


1.4e-24 


95.1 


1657 


MMR_HSR1


GTPase of unknown function


0.0011


-45.5 


1659 




Ubiquitin carboxyl-terminal
hydrolase family 


2.5e-ll 


51.1 


1660 


actin 


Actin 


6.6e-21 


63.9 


1661 


BAH 


BAH domain 


1.7e-82


287.5


1662 


vwa 


von Willebrand factor type A 
domain 


0 


1909.4 


1663 


WD40 


WD domain, G-beta repeat 


1.4e-67 


237.9 


1667 


zf-C2H2


Zinc finger, C2H2 type 


1.3e-93 


324.4


1669


Nol1_Nop2_Su

n


NOL1/NOP2/sun family


1.3e-23


84.3


1671 


SH2 


Src homology domain 2 


5.4e-15


46.9 



SEQ ID
NO:


PFAM NAME


DESCRIPTION


p-value


PFAM 
SCORE 


1672 


chromo


'chromo' (CHRromatin
Organization Modifier)


2.1e-18


67.7 


1674 


zf-CCCH 


Zinc finger C-x8-C-x5-C-x3-H
type


0.0025


17.6 


1676 


Glyco_hydro_
47


Glycosyl hydrolase family 47


1.5e-187


636.2 


1677
1680


Glyco_hydro_
47

WD40


Glycosyl hydrolase family 47
WD domain, G-beta repeat


4.5e-74
1.1e-27


259.5
105.5


1681 
1683 
1691


WD40

MMR_HSR1
rrm


WD domain, G-beta repeat
GTPase of unknown function
RNA recognition motif.


1.1e-27
1.8e-78
1.8e-37


105.5
274.1 
137.9 


1692 
1693


rrm
AAA


RNA recognition motif.

ATPases associated with various

cellular act


1.8e-37
1.3e-81


137.9
284.5


1697 


Ferric_reduc
t


Ferric reductase like
transmembrane com


8.4e-82


285.2 


1698 

1699
1700


Ferric_reduc
t

zf-C2H2


Ferric reductase like
transmembrane com
Zinc finger, C2H2 type
ADP-ribosylation factor family


3.5e-53

4.4e-34
9e-19


190.1

126.6
75.8 


1702 
1703 
1707 

1709 
1710


GTP_EFTU
SCAN
pkinase 

WD40 


Elongation factor Tu family
SCAN domain

Eukaryotic protein kinase
domain

WD domain, G-beta repeat
Leucine Rich Repeat


0.014
1.8e-54
1.2e-88

0.0035
l.iJe-3d " 


11.4
194.4
307.3 

24.0 

H5 .3 ' 


1711 
1712 
1713 


WW 

zf-CCCH 


WW domain 
Ank repeat 

Zinc finger C-x8-C-x5-C-x3-H 
type 


7.6e-12 
4.2e-34 
2.6e-09


52.8 

126.7 

38.3


1714 

1716
1718 


zf-CCCH 
ras 

HMG_box


Zinc finger C-x8-C-x5-C-x3-H
type

Ras family

HMG (high mobility group) box


2.6e-09
4.4e-41


38.3
149.9


1719 
1721 


TBC 


TBC domain

Helix-loop-helix DNA-binding


8.3e-21 
l.ae-45 
9.2e-10


§2*6 

165.2 

45.9 


1723 


dsrm


Double-stranded RNA binding
motif 


2.9e-05 


30.5 


1724 

1725 
172S 


RrnaAD 

CIDE-N 
HAT 


Ribosomal RNA adenine

dimethylases

CIDE-N domain

HAT (Half-A-TPR) repeats


0.045
5.9e-40


9.2
146.2 


1728 
1733 


efhand 

Hist_deacety

l


EF hand

Histone deacetylase family


2.9e-44
5.1e-20
1.7e-104


160.5 

79.9 

360.6 


1735 


LRR


Leucine Rich Repeat


4.6e-34


126.6 


1739 
1743 


PI-PLC-X
ras


Phosphatidylinositol-specific

phospholipase

Ras family


0.0023 


16.1 


1744 
1745 
1746 
1751
1754
1756
1758

1760

1761
1765
1769
1775
1779

1783

1784


RasGEF 

adh_short 
zf-C2H2

fn3

zf-C2H2

Nop

Nop

MMR_HSR1

CN_hydrolase
ank
Oxysterol_BP
RhoGEF
RhoGEF


Ras family 
RasGEF domain 
short chain dehydrogenase 
Zinc finger, C2H2 type
Fibronectin type III domain
Zinc finger, C2H2 type
RNA recognition motif.
Putative snoRNA binding domain
Putative snoRNA binding domain
GTPase of unknown function
Carbon-nitrogen hydrolase
Ank repeat

Oxysterol-binding protein
RhoGEF domain

RhoGEF domain


3.7e-10 

3.7e-10 

3.2e-49 

7.1e-08

9e-33

5.5e-101

fa.3e-S3 

0.017

6.1e-95

6.1e-95

6.4e-41 

3e-06 

4.1e-07 

4.7e-56 

1.6e-23 

1.6e-23 


-21.3

-21.3 

176.9 

ol . o 

142.2 

348.9

322.1 

21.2 

328.8 

328.8

149.4 

-43.9 

37.1 

199.6 

91.6

91.6
SEQ ID
NO:


PFAM NAME 


DESCRIPTION 


p-value 


PFAM 
SCORE 


1785 


rrm 


RNA recognition motif. 


6.4e-14


59.7 



TABLE 5





SEQ ID NO: 


POSITION OF 
SIGNAL IN AMINO 
ACID SEQUENCE 


MaxS (MAXIMUM 
SCORE) 


MeanS (MEAN
SCORE)


1 


1-21 


0.991


0.955 


2 


1-31 


0.995 


0.944


3 


1-33 


0.949 


0.736 


4 


1-19 


0.970 


0.951 


5


1-26 


0.971 


0.863 


6 


1-26 


0.971 


0.863


7 


1-26 


0.971


0.863


8 


1-26 


0.971


0.863


9 


1-45


0.982


0.901


10 


1-21 


0.991


0.955


11 


1-23 


0.989


0.699 


12 


1-25 


0.955


0.603


13 


1-18 


0.932 


0.625


14 


1-18 


0.938


0.876


15 


1-25 


0.941


0.811


16 


1-17 


0.972


0.939


17 


1-27 


0.964


0.777


18 


1-16 




0.657


19 


1-19 


V • 73 J 


0.840


20 


1-20 




0 . 701 


21 


1-22 


0 « 974 


0 . 8S0 


22 


1-33 


0 . 961 


0 .895 


23 


1_X9 




0 ,959 


24 


i-31 




0 - 944 


2S 


1-22 


\j . if ft} 


0 , 935 




3,-27 


0,996 


0 .928 


27 


1-24 


0 . 953 ' 


0 -73 9 


28 


1-21 ~ 


0 . 906 


0*688 


29 


1-31 


0 . 986 


0 . 841 


30 


1-28 


0.S80 


0 , 893 


31 


1-19 


0 . 993 


0 . 976 


32 


1-22 


0, 998 


0 . 309 


35 


1-33 


0.949 






1-33 


0.949 


0 ,736 


46 


1-19 


O.S70 


0 . 951 


67 


1-25 


0.968 


0 .848 


71 


1-18 


0.949 


0 . 845 


72 


1-30 


0.991 


0 .919 - 


75 


1-29 


0.958 


0 .854 


88 


1-20 


0.986 


0 .94 5 




94 


1-33 


0.994 


0 .943 




- 97 


1-46 


0.964 


0.59S 




103 


1-49 


0.983 


O.570 




108 


1-26 


0.978 


0.885 




111 


1-23 


0.989 


0.899 




126 


1-25 


0.955 


0.803 




129 


1-19 


0.963 


0.918 




13B 


1-29 


0.971 


0.844 




143 


1-18 


0.514 


0.628 




148 


1-20 


0,969 


0.904 




156 


1-25 


0,941 


0.811 




158 


1-22 


0.S79 


0.927 




160 


1-17 


0.972 


0.939 




161 


1-46 


0.903 


0.571 




162 


1-25 


0.937 


0.729 




168 


1-16 


0.939 


0.826 




171 


1-27 


0.364 


0.777 




178 


1-21 ■ - 


0.945 


0.825 




180 


1-27 


0.981 


0.941 




187 


1-28 


0 . 982 


0,936 




190 


1-19 


0.953 


O.S40 




196 


1-22 


0.375 


0,916 




197 


1-22 


0 . 9^3 


0.936 

199         1-20         0.935     0.701
200         1-23         0.977     0.773
206         1-30         0.984     0.890
207         1-19         0.990     0.924
208         1-22         0.974     0.850
210         1-40         0.940     0.670
211         1-29         0.971     0.849
216         1-24         0.986     0.956
21ti        1-33         0.961     0.895
219         1-19         0.970     0.871
221         1-19         0.904     0.553
222         1-21         0.917     0.555
230         1-19         0.991     0.959
231         1-26         0.953     0.800
232         1-25         0.988     0.826
239         1-23         0.969     0.826
240         1-17         0.982     0.955
241         1-17         0.982     0.955
245         1-30         0.970     0.722
248         1-22         0.976     0.935
249         1-23         0.966     0.940
252         1-18         0.971     0.923
261         1-24         0.883     0.587
265         1-18         0.939     0.868
272         1-24         0.953     0.739
283         1-21         0.906     0.688
284         1-29         0.997     0.854
290         1-31         0.986     0.841
302         1-28         0.980     0.893
304         1-16         0.907     0.635
312         1-19         0.993     0.976
313         1-17         0.930     0.753
323         1-22         0.998     0.909
324         1-17         0.982     0.954
328         1-19         0.971     0.865
329         1-22         0.963     0.924
330         1-33         0.978     0.841
331         1-24         0.920     0.712
332         1-24         0.975     0.881
333         1-19         0.984     0.941
334         1-20         0.899     0.567
335         1-27         0.942     0.813
336         1-20         0.952     0.850
337         1-38         0.942     0.653
338         1-27         0.973     0.772
339         1-36         0.979     0.804
340         1-27         0.886     0.597
343         1-19         0.971     0.865
344         1-22         0.994     0.928
345         1-17         0.966     0.687
346         1-19         0.936     0.822
347                      0.963     0.924
349         1-24         0.982     0.966
351         1-21         0.918     0.815
352         1-31         0.988     0.912
354         1-31         0.974     0.839
355         1-29         0.932     0.632
356         1-15         0.994     0.969
357         1-33         0.935     0.726
360         1-27         0.938     0.82-)
361         1-25         0.954     0.674
362         1-22         0.929     0.788
363         1-21         0.681     0.715
364         1-33         0.978     0.841
365         1-33         0.978     0.841

366         1-21         0.916     0.820
367         1-19         0.936     0.822
368         1-25         0.972     0.874
370         1-24         0.920     0.712
371         1-24         0.961     0.773
372         1-27         0.919     0.768
373         1-19         0.986     0.945
375         1-32         0.994     0.932
376         1-34         0.987     0.810
377         1-17         0.995     0.950
378         1-49         0.971     0.749
380         1-20         0.968     0.874
381         1-20         0.928     0.782
382         1-19         0.986     0.934
383         1-28         0.965     0.829
384         1-39         0.970     0.551
386         1-24         0.975     0.881
388         1-30         0.989     0.868
389         1-19         0.984     0.941
390         1-26         0.971     0.782
            1-20         0.981     0.900
393         1-16         0.968     0.890
394         1-23         0.937     0.701
397         1-22         0.965     0.854
399         1-46         0.977     n KQit
401         1-20         0.899     u . ats /
402         1-22         0.967     0.931
403         1-27         0.992     0.934
404         1-19         0.991     0.973
405         1-23         0.994     0.921
407         1-35         0.987     0.658
408         1-39         0.976     0.551
409         1-33         0.897     0.570
410         1-25         0.990     0.962
411         1-38         0.977     0.827
412         1-20         0.944     0.768
413         1-20         0.988     0.965
414         1-46         0.993     0.638
415         1-23         0.981     0.946
417         1-29         0.941     0.672
418         1-20         0.952     0.850
419         1-19         0.986     0.967
420         1-29         0.965     0.861
421         1-22         0.889     0.785
422         1-48         0.982     0.862
424         1-19         0.979     0.933
428         1-38         0.942     0.653
430         1-18         0.947     0.595
432         1-33         0.957     0.789
433         1-26         0.979     0.904
434         1-27         0.962     0.777
435         1-24         0.998     0.977
436         1-27         0.973     0.772
443         1-15         0.966     0.940
448         1-36         0.979     0.804
453         i-<lX        0.958     0.609
455         1-33         0.943     0.606
457         1-27         0.888     0.597
462         1-16         0.925     0.681
486         1-27         0.972     0.845
495         x-;i4        0.917     0.636
498         1-26         0.993     0.890
505         1-20         0.976     0.926
507         1-17         0.966     0.687
510         1-23         0.930     0.593

511         1-23         0.930     0.593
512         1-23         0.930     0.593
515         1-18         0.978     0.956
523         1-19         0.936     0.822
529         1-22         0.963     0.924
545         1-24         0.982     0.966
550         1-30         0.933     0.713
552         1-21         0.973     0.912
554         1-23         0.969     0.784
571         1-21         0.918     0.815
574         1-31         0.988     0.912
580         1-39         0.925     0.556
594         1-31         0.974     0.839
608         1-29         0.932     0.632
609         1-29         0.932     0.632
610         1-21         0.990     0.948
621         1-15         0.994     0.969
623         1-33         0.935     0.726
653         1-27         0.936     0.827
668         1-22         0.929     0.788
677         1-16         0.948     0.807
685         1-21         0.881     0.715
699         1-22         0.975     0.816
702         1-31         0.968     0.898
707         1-16         0.850     0.562
713         1-25         0.966     0.743
718         1-19         0.936     0.822
719         1-20         0.961     0.824
729         1-29         0.972     0.874
735         1-46         0.903     0.598
746         1-14         0.926     0.730
747         1-22         0.965     0.876
748         1-29         0.968     0.785
759         1-24         0.961     0.773
767         1-27         0.919     0.768
768         1-33                   0.585
773         1-42         0.959     0.702
779         1-19         0.986     0.945
797         1-19         0.944     0.759
798         1-19         0.900     0.^68
820         1-17         0.995     0.950
827         1-49         0.971     0.749
848         1-20         0.968     0.874
864         1-20         0.928     0.782
866         1-19         0.986     0.934
873         1-23         0.948     0.866
881         1-28         0.965     0.829
887         1-39         0.970     0.551
927         1-30         0.989     0.868
934         1-48         0.988     0.777
" a'^ci     1-39         0.994     0.889
944         1-26         0.971     0.782
950         1-29         0.957     0.845
963         1-20         0.981     0.900
964         1-20         0.886     0.558
973         1-16         0.968     0.890
980         1-34         0.961     0.749
981         1-20         0.953     0.822
984         1-12         0.938     0.780
1015        1-22         0.985     0.854
1040        1-46         0.977     0.698
1052        1-18         0.969     0.842
1059        1-20         0.927     0.867
1065        1-33         0.983     0.918
1069        1-22         0.993     0.935

1075        1-27         0.992     0.934
1080        1-19         0.331     0.829
1092        1-19         0.991     0.973
1094        1-46         0.992     0.653
1095        1-30         0.974     0.929
1105        1-23         0.994     0.921
1123        1-35         0.987     0.658
1138        1-32         0.954     0.613
1140        1-39         0.989     0.789
1142        1-33         0.897     0.570
1152        1-25         0.990     0.962
1170        1-38         0.977     0.827
1176        1-20         0.944     0.768
1187        1-20         0.988     0.965
1189        1-35         0.967     0.839
1192        1-46         0.993     0.638
1193        1-16         0.925     0.710
1197        1-29         0.985     0.853
1208        1-23         0.961     0.940
1225        1-29         0.941     0.672
1245        1-19         0.986     0.967
1258        1-29         0.965     0.861
1265        1-22         0.889     Q ^
1266        1-20         0.944     0.809
1276        1-48         0.982     0.862
1292        1-19         0.979     0.933
1296        1-21         0.984     0.944
1297        1-19         0.984     0.953
1332        1-38         0.942     0.653
1358        1-18         0.947     0.595
1371        1-33         0.957     0.789
1380        1-26         0,Bl9     0.304
1397        1-27         0.962     0.777
1399        1-23         0.997     0.360
1404        1-24         0.998     0.977
1410        1-15         0.946     0.845
1414        1-24         0.913     0.588
1415        1-19         0.982     0.929
1416        1-12         0.931     0.891
1418        1-30         0.933     0.563
1420        1-20         0.881     0.561
1421        1-19         0.990     0.969
1423        1-17         0.968     0.863
1424        1-21         0.885     0.591
1425        1-24         0.913     0.588
1426        1-24         0.913     0.588
1428        1-25         0.957     0.899
1430        1-34         0.977     0.819
1431        1-28         0.979     0.923
1432        1-36         0.957     0.613
1433        1-32         0.921     0.753
1434        1-39         0.983     0.621
1435        1-25         0.910     0.631
1436        1-42         0.988     0.868
1437        1-22         0.999     0.980
1442        1-20         0.918     0.753
1448        1-12         0.931     0.891
1462        1-18         0.968     0.888
1490        1-20         0.881     0.561
1518        1-17         0.968     0.863
1525        1-21         0.885     0.591
1547        1-28         0.974     0.891
1561        1-25         0.967     0.899
1580        1-17         0.923     0.824
1593        1-28         0.979     0.923

1596        1-16         0.929     0.709
1601        1-36         0.957     0.613
1606        1-22         0.979     0.831
1607        1-20         0.974     0.770
1608        1-32         0.921     0.753
1614        1-33         0.969     0.829
1616        1-20         0.959     0.869
1625        1-39         0.983     0.621
1632        1-25         0.910     0.631
1636        1-33         0.897     0.591
1639        1-42         0.988     0.868
1645        1-20         0.927     0.568
1647        1-17         0.923     0.742
1648        1-22         0.998     0.980



TABLE 6

SEQ ID NO:  SEQ ID      SEQ ID NO:  SEQ ID      Priority        SEQ ID
of full-    NO: of      of contig   NO:         docket number_  NO: in
length      full-       nucleotide  of contig   corresponding   U.S.S.N.
nucleotide  length      sequence    peptide     SEQ ID NO: in   09/488,725
sequence    peptide                 sequence    priority
            sequence                            application
1           1787        3573        5359        784CIP2_1       1103
2           1788        3574        5360        784CIP2_2       2673
3           1789        3575        5361        784CIP2_3       4117
4           1790        3576        5362        784CIP2_4       5556
5           1791        3577        5363        784CIP2_5       5562
6           1792        3578        5364        784CIP2_6       5562
7           1793        3579        5365        784CIP2_7       5562
8           1794        3580        5366        784CIP2_8       5562
9           1795        3581        5367        784CIP2_9       5563
10          1796        3582        5368        784CIP2_10      5564
11          1797        3583        5369        784CIP2_11      5565
12          1798        3584        5370        784CIP2_12      5689
13          1799        3585        5371        784CIP2_13      5729
14          1800        3586        5372        784CIP2_14      5745
15          1801        3587        5373        784CIP2_15      5777
16          1802        3588        5374        784CIP2_16      5777
17          1803        3589        5375        784CIP2_17      5789
18          1804        3590        5376        784CIP2_18      5792
19          1805        3591        5377        784CIP2_19      5804
20          1806        3592        5378        784CIP2_20      5805
21          1807        3593        5379        784CIP2_21      5805
22          1808        3594        5380        784CIP2_22      5844
23          1809        3595        5381        784CIP2_23      5844
24          1810        3596        5382        784CIP2_24      5850
25          1811        3597        5383        784CIP2_25      58^7
26          1812        3598        5384        784CIP2_26      5973
27          1813        3599        5385        784CIP2_27      5995
28          1814        3600        5386        784CIP2_28      5995
29          1815        3601        5387        784CIP2_29      6005
30          1816        3602        5388        784CIP2_30      6007
31          1817        3603        5389        784CIP2_31      6007
32          1818        3604        5390        784CIP2_32      6009
33          1819        3605        5391        784CIP2_33      6012
34          1820        3606        5392        784CIP2_34      6015
35          1821        3607        5393        784CIP2_35      6016
36          1822        3608        5394        784CIP2_36      6016
37          1823        3609        5395        784CIP2_37      6018
38          1824        3610        5396        784CIP2_38      6018
39          1825        3611        5397        784CIP2_39      6018
40          1826        3612        5398        784CIP2_40      6023
41          1827        3613        5399        784CIP2_41      6070
42          1828        3614        5400        784CIP2_42      6081
43          1829        3615        5401        784CIP2_43      6089
44          1830        3616        5402        784CIP2_44      6116
45          1831        3617        5403        784CIP2_45      6118
46          1832        3618        5404        784CIP2_46      6130
47          1833        3619        5405        784CIP2_47      6177
48          1834        3620        5406        784CIP2_48      6189
49          1835        3621        5407        784CIP2_49      6191
50          1836        3622        5408        784CIP2_50      6204
51          1837        3623        5409        784CIP2_51      6204
52          1838        3624        5410        784CIP2_52      6284
53          1839        3625        5411        784CIP2_53      6367
54          1840        3626        5412        784CIP2_54      6436
55          1841        3627        5413        784CIP2_55      6442
56          1842        3628        5414        784CIP2_56      6445
57          1843        3629        5415        784CIP2_57      6457
58          1844        3630        5416        784CIP2_58      6458
59          1845        3631        5417        784CIP2_59      6458
60          1846        3632        5418        784CIP2_60      6462
61          1847        3633        5419        784CIP2_61      6472
62          1848        3634        5420        784CIP2_62      6499
63          1849        3635        5421        784CIP2_63      6499
64          1850        3636        5422        784CIP2_64      6505
65          1851        3637        5423        784CIP2_65      6534
66          1852        3638        5424        784CIP2_66      6534
67          1853        3639        5425        784CIP2_67      6540
68          1854        3640        5426        784CIP2_68      6550
69          1855        3641        5427        784CIP2_69      6550
70          1856        3642        5428        784CIP2_70      6592
71          1857        3643        5429        784CIP2_71      6645
72          1858        3644        5430        784CIP2_72      A *71
73          1859        3645        5431        784CIP2_73      6763
74          1860        3646        5432        784CIP2_74      CTg'^
75          1861        3647        5433                        o /lib
76          1862        3648        5434        784CIP2_76      6824
77          1863        3649        5435        784CIP2_77      6830
78          1864        3650        5436        784CIP2_78      6831
79          1865        3651        5437                        6832
80          1866        3652        5438                        6834
81          1867        3653        5439                        6834
82          1868        3654        5440                        6835
83          1869        3655        5441        784CIP2_83      "gb"?"'?
84          1870        3656        5442        784CIP2_84      6843
85          1871        3657        5443                        6859
86          1872        3658        5444        784CIP2_86      6915
87          1873        3659        5445                        6932
88          1874        3660        5446        784CIP2_88      gg^7
89          1875        3661        5447        784CIP2_89      w?OX
90          1876        3662        5448        784CIP2_90      Of / J
91          1877        3663        5449        784CIP2_91      6973
92          1878        3664        5450                        7O"0"7
93          1879        3665        5451        784CIP2_94      7018
94          1880        3666        5452        784CIP2_95      7019
95          1881        3667        5453        784CIP2_96      7020
96          1882        3668        5454        784CIP2_97      7020
97          1883        3669        5455        784CIP2_98      7021
98          1884        3670        5456        784CIP2_99      7023
99          1885        3671        5457        784CIP2_100     7027
100         1886        3672        5458        784CIP2_101     7028
101         1887        3673        5459        784CIP2_102     7029
102         1888        3674        5460        784CIP2_103     7031
103         1889        3675        5461        784CIP2_104     7032
104         1890        3676        5462        784CIP2_105     7033
105         1891        3677        5463        784CIP2_106     7035
106         1892        3678        5464        784CIP2_107     7036
107         1893        3679        5465        784CIP2_108     7039
108         1894        3680        5466        784CIP2_109     7043
109         1895        3681        5467        784CIP2_110     7044
110         1896        3682        5468        784CIP2_111     7046
111         1897        3683        5469        784CIP2_112     7054
112         1898        3684        5470        784CIP2_113     7061
113         1899        3685        5471        784CIP2_114     7077
114         1900        3686        5472        784CIP2_115     7092
115         1901        3687        5473        784CIP2_116     7094
116         1902        3688        5474        784CIP2_117     7106
117         1903        3689        5475        784CIP2_118     7107
118         1904        3690        5476        784CIP2_119     7111
119         1905        3691        5477        784CIP2_120     7123
120         1906        3692        5478        784CIP2_121     7142
121         1907        3693        5479        784CIP2_122     7142

122         1908        3694        5480        784CIP2_123     7154
123         1909        3695        5481        784CIP2_124     7160
124         1910        3696        5482        784CIP2_125     7169
125         1911        3697        5483        784CIP2_126     7185
126         1912        3698        5484        784CIP2_127     7197
127         1913        3699        5485        784CIP2_128     7219
128         1914        3700        5486        784CIP2_129     7226
129         1915        3701        5487        784CIP2_130     7229
130         1916        3702        5488        784CIP2_131     7234
131         1917        3703        5489        784CIP2_132     7235
132         1918        3704        5490        784CIP2_133     7235
133         1919        3705        5491        784CIP2_134     7238
134         1920        3706        5492        784CIP2_135     7247
135         1921        3707        5493        784CIP2_136     7261
136         1922        3708        5494        784CIP2_137     7262
137         1923        3709        5495        784CIP2_138     7267
138         1924        3710        5496        784CIP2_139     7272
139         1925        3711        5497        784CIP2_140     7273
140         1926        3712        5498        784CIP2_141     7282
141         1927        3713        5499        784CIP2_142     7288
142         1928        3714        5500        784CIP2_143     ^i9i
143         1929        3715        5501        784CIP2_144     7293
144         1930        3716        5502        784CIP2_145     7294
145         1931        3717        5503        784CIP2_146     7299
146         1932        3718        5504        784CIP2_147     7300
147         1933        3719        5505        784CIP2_148     7312
148         1934        3720        5506        784CIP2_149     7313
149         1935        3721        5507        784CIP2_150     7315
150         1936        3722        5508        784CIP2_151     7318
151         1937        3723        5509        784CIP2_152     7321
152         1938        3724        5510        784CIP2_153     7330
153         1939        3725        5511        784CIP2_154     7331
154         1940        3726        5512        784CIP2_155     7333
155         1941        3727        5513        784CIP2_156     7350
156         1942        3728        5514        784CIP2_157     7352
157         1943        3729        5515        784CIP2_158     7384
158         1944        3730        5516        784CIP2_159     7403
159         1945        3731        5517        784CIP2_160     7431
160         1946        3732        5518        784CIP2_161     7441
161         1947        3733        5519        784CIP2_162     7453
162         1948        3734        5520        784CIP2_163     7467
163         1949        3735        5521        784CIP2_164     7471
164         1950        3736        5522        784CIP2_165     7493
165         1951        3737        5523        784CIP2_166     7502
166         1952        3738        5524        784CIP2_167     7511
167         1953        3739        5525        784CIP2_168     7514
168         1954        3740        5526        784CIP2_169     7520
169         1955        3741        5527        784CIP2_170     7541
170         1956        3742        5528        784CIP2_171     7570
171         1957        3743        5529        784CIP2_172     7578
172         1958        3744        5530        784CIP2_173     7583
173         1959        3745        5531        784CIP2_174     7592
174         1960        3746        5532        784CIP2_175     7601
175         1961        3747        5533        784CIP2_176     7602
176         1962        3748        5534        784CIP2_177     7608
177         1963        3749        5535        784CIP2_178     7615
178         1964        3750        5536        784CIP2_179     7617
179         1965        3751        5537        784CIP2_181     7624
180         1966        3752        5538        784CIP2_182     7626
181         1967        3753        5539        784CIP2_183     7640
182         1968        3754        5540        784CIP2_184     7641
183         1969        3755        5541        784CIP2_185     7641

184         1970        3756        5542        784CIP2_186     7641
185         1971        3757        5543        784CIP2_187     7642
186         1972        3758        5544        784CIP2_188     7649
187         1973        3759        5545        784CIP2_189     7656
188         1974        3760        5546        784CIP2_190     7657
189         1975        3761        5547        784CIP2_191     7657
190         1976        3762        5548        784CIP2_192     7662
191         1977        3763        5549        784CIP2_193     7668
192         1978        3764        5550        784CIP2_194     7673
193         1979        3765        5551        784CIP2_195     7690
194         1980        3766        5552        784CIP2_196     7700
195         1981        3767        5553        784CIP2_197     7709
196         1982        3768        5554        784CIP2_198     7736
197         1983        3769        5555        784CIP2_199     7737
198         1984        3770        5556        784CIP2_200     7744
199         1985        3771        5557                        7771
200         1986        3772        5558                        7786
201         1987        3773        5559                        7791
202         1988        3774        5560        784CIP2_204     7797
203         1989        3775        5561                        7806
204         1990        3776        5562        784CIP2_206     7812
205         1991        3777        5563        784CIP2_207     7812
206         1992        3778        5564        784CIP2_208     7818
207         1993        3779        5565        784CIP2_209     7822
208         1994        3780        5566        784CIP2_210     7827
209         1995        3781        5567        784CIP2_211     7830
210         1996        3782        5568        784CIP2_212     7835
211         1997        3783        5569        784CIP2_214     7840
212         1998        3784        5570        784CIP2_215     7858
213         1999        3785        5571        784CIP2_216     7858
214         2000        3786        5572        784CIP2_217     7861
215         2001        3787        5573        784CIP2_218     7866
216         2002        3788        5574        784CIP2_219     7868
217         2003        3789        5575        784CIP2_220     7896
218         2004        3790        5576        784CIP2_221     7898
219         2005        3791        5577        784CIP2_222     7900
220         2006        3792        5578        784CIP2_223     7906
221         2007        3793        5579        784CIP2_224     7908
222         2008        3794        5580        784CIP2_225     7909
223         2009        3795        5581        784CIP2_226     7917
224         2010        3796        5582        784CIP2_227     7932
225         2011        3797        5583        784CIP2_228     7940
226         2012        3798        5584        784CIP2_229     7940
227         2013        3799        5585        784CIP2_230     7984
228         2014        3800        5586        784CIP2_231     7984
229         2015        3801        5587        784CIP2_232     8001
230         2016        3802        5588        784CIP2_233     8021
231         2017        3803        5589        784CIP2_234     8029
232         2018        3804        5590        784CIP2_235     8033
233         2019        3805        5591        784CIP2_236     8040
234         2020        3806        5592        784CIP2_237     8052
235         2021        3807        5593        784CIP2_238     8096
236         2022        3808        5594        784CIP2_239     8096
237         2023        3809        5595        784CIP2_240     8113
238         2024        3810        5596        784CIP2_241     8126
239         2025        3811        5597        784CIP2_242     8132
240         2026        3812        5598        784CIP2_243     8137
241         2027        3813        5599        784CIP2_244     8137
242         2028        3814        5600        784CIP2_245     8159
243         2029        3815        5601        784CIP2_246     8159
244         2030        3816        5602        784CIP2_247     8161
245         2031        3817        5603        784CIP2_248     8176

246         2032        3818        5604        784CIP2_249     8196
247         2033        3819        5605        784CIP2_250     8200
248         2034        3820        5606        784CIP2_251     8212
249         2035        3821        5607        784CIP2_252     8220
250         2036        3822        5608        784CIP2_253     8238
251         2037        3823        5609        784CIP2_254     8254
252         2038        3824        5610        784CIP2_255     8255
253         2039        3825        5611        784CIP2_256     8288
254         2040        3826        5612        784CIP2_257     8296
255         2041        3827        5613        784CIP2_258     8329
256         2042        3828        5614        784CIP2_259     8362
257         2043        3829        5615        784CIP2_260     8429
258         2044        3830        5616        784CIP2_261     8436
259         2045        3831        5617        784CIP2_262     8448
260         2046        3832        5618        784CIP2_263     8472
261         2047        3833        5619        784CIP2_264     8502
262         2048        3834        5620        784CIP2_265     8504
263         2049        3835        5621        784CIP2_266     8507
264         2050        3836        5622        784CIP2_268     8509
265         2051        3837        5623        784CIP2_269     8515
266         2052        3838        5624        784CIP2_270     8519
267         2053        3839        5625        784CIP2_271     8530
268         2054        3840        5626        784CIP2_272     8532
269         2055        3841        5627        784CIP2_273     8532
270         2056        3842        5628        784CIP2_274     8539
271         2057        3843        5629        784CIP2_275     8541
272         2058        3844        5630        784CIP2_276     8543
273         2059        3845        5631        784CIP2_277     8593
274         2060        3846        5632        784CIP2_278     8595
275         2061        3847        5633        784CIP2_279     8615
276         2062        3848        5634        784CIP2_280     8620
277         2063        3849        5635        784CIP2_281     8621
278         2064        3850        5636        784CIP2_282     8623
279         2065        3851        5637        784CIP2_283     8625
280         2066        3852        5638        784CIP2_284     8628
281         2067        3853        5639        784CIP2_285     8628
282         2068        3854        5640        784CIP2_286     8629
283         2069        3855        5641        784CIP2_287     8630
284         2070        3856        5642        784CIP2_288     8631
285         2071        3857        5643        784CIP2_289     8633
286         2072        3858        5644        784CIP2_290     8634
287         2073        3859        5645        784CIP2_291     8635
288         2074        3860        5646        784CIP2_292     8636
289         2075        3861        5647        784CIP2_293     8659
290         2076        3862        5648        784CIP2_294     8660
291         2077        3863        5649        784CIP2_295     8667
292         2078        3864        5650        784CIP2_296     8667
293         2079        3865        5651        784CIP2_297     8685
294         2080        3866        5652        784CIP2_298     8805
295         2081        3867        5653        784CIP2_299     8896
296         2082        3868        5654        784CIP2_300     8978
297         2083        3869        5655        784CIP2_301     9046
298         2084        3870        5656        784CIP2_302     9048
299         2085        3871        5657        784CIP2_303     9116
300         2086        3872        5658        784CIP2_304     9195
301         2087        3873        5659        784CIP2_305     9201
302         2088        3874        5660        784CIP2_306     9307
303         2089        3875        5661        784CIP2_307     9321
304         2090        3876        5662        784CIP2_308     9397
305         2091        3877        5663        784CIP2_309     9405
306         2092        3878        5664        784CIP2_310     9406
307         2093        3879        5665        784CIP2_311     9422
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SEQ ID NO: 

of full- 
length 
nucleotide 
sequence 

308 


SEQ ID 
NO: of 
full- 
length ' 
peptide 
sequence 
2094 


SEQ ID NO:" 
of contig 
nucleotide 
sequence 

3880 


SEQ ID 
NO: 

of contig 

peptide 

sequence 

5666 


PJTioiri ty 
docket number 
corresponding 
SEQ ID NO: in 
priority 
application 
/o^t-XP2^312 


SKQ ID 
NO: in 
U. S . S .N. 
09/488, 725 

9494 


309 
310 


2095 
2096 


3881 
3882 


5667 
5668 


784CIP2 313 
784CIP2 314 ■ 


9512 
9632 


311 


2097 


3883 


5669 


784CIP2 315 


9661 


312 


2098 


3884 


5670 


784CIP2 316 


9664 


313 
314 


2099 
2100 


3885 
3886 


5671 
5672 


7 84CIP2 317 ■ 


i 9691 


315 


2101 


3 887 


5673 


• 784CIP2 318 
784CrP2_319 


! 9700 
9716 


316 


2102 


3888 


5674 


784CIP2 320 


9721 


317 


2103 


3S89 


5675 


784CIP2 321 


9870 


318 


2104 


3890 


567^ " 


784CIP2 322 


9887 


319 
320 
321 


2105 
2106 
2107 


3891 
3892 
3893 


5677 
5678 
5679 


784CIP2 323 
784CIP2 324 


9923 
9938 


322 
323 
324 
325 
326 
327 
328 
339 
330 
331 
332 


2108 

2109 

2110 

2111 

2112 

2113 

2114 

2115 " 

«x xo 

2117 


3894 
3895 
3896 
3897 
3898 
3899 
3900 
3901 
3902 
3903 


5680 
5681 
5662 
5683 
'"■ 5684 
~ S68S 
5686 
5687 
5688 
5689 


784CIP2 325 
784GIP2 326 
784CIP2 327 
784CIP2 328 
784CrP2 329 
784CIP2 330 
784CIP2 331 
784CIP2B 1 
784CIP2B 2 
784CIP2B 3 
784CIP2B 4 


9964 
10007 
10O09 
10046 
10156 
10276 
10283 
152 
167 
20s 
210 


333 
334 
335 
336 
337 
338 
339 
340 
341 
342 
343 


2118 
2119 
2120 
2121 
2122 
2i23 
2:124 
2125 
2126 
2127 
2128 
2129 


3904 
390S 
3906 
3907 
3908 
3909 

3911 
3^12 
3913 
3914 
3915 


5690 
5691 
5692 
5693 
5694 
S695 
5696 
5697 
5696 
B699 
5700 
5701 


784CIP2B 5 
784CIP2B 6 
784CIP2B 7 
784CIP2B 8 
784CIP2B 9 
784CIP2B 10 
784CIP2B 11 * 
784CrP2B 12 
784GIP2B 13 
784CIP2B 14 
784CIP2B 15 


225 
226 
254 
268 
293 
293 
293 
302 
311 
352 
358


344 
345 
346 
347 


2130 
2131 
2132 
2133 


3916 
3917 
3918 
3919 


5702 
5703 
5704 
5705 


784CIP2B 16
784CIP2B 17
784CIP2B 18
784CIP2B 19
784CIP2B 20


368 
393 
477 
508 
508 


348 
349 
350 
351 


2134 
2135 
2136 
2137 


3920 
3921
3922 
3923 


5706 
5707 
5708 
5709 


784CIP2B 21 
784CIP2B 22 
784CIP2B 23 


515 
578 
588 


352 

353 

354

355 

356 

357 

358 

359 

360 

361 

362 

363 

364 

365

366 

367 

368 

369 


2138 

2139

2140 

2141 

2142 

2143

2144 

2145 

2146 

2147 

2148 

2149 

2150 

2151 

2152 

2153

2154 

2155


3924 
3925 
3926 
3927 
3928 
3929 
3930 
3931 
3932 
3933 
3934 
3935 
3936 
3937 
3938 
3939 
3940 
3941 


5710 
5711
5712
5713 
5714 
5715 
5716 
5717 
5718 
5719 
5720 
5721 
5722 
5723 
5724 
5725 
5726 
5727 


784CIP2B 24 
784CIP2B 25 
784CIP2B 26 
784CIP2B 27
784CIP2B 28
784CIP2B 29
784CIP2B 30
784CIP2B 31
784CIP2B_32
784CIP2B 33
784CIP2B 34
784CIP2B 35
784CIP2B 36
784CIP2B 37
784CIP2B 38
784CIP2B 39
784CIP2B 40
784CIP2B_41
784CIP2B 42 


591 

593 

594 

619 

620 

654 

692 

753 

758 

787 

833 

838

870 

891 

891 

921 

924 

932 

942


370 


2156


3942 


5728 


784CIP2B_43 


958


371 


2157


3943 


5729 


784CIP2B 44 


968 


372 


2158 


3944 


5730 


784CIP2B 45 


992 


373 


2159 


3945 


5731 


784CIP2B 46 


1025 


374 


2160 


3946 


5732 


784CIP2B 47 


1074 


375


2161 


3947 


5733 


784CIP2B 48 


1104 


376 


2162 


3948 


5734 


784CIP2B 49


1114 


377 


2163 


3949 


5735 


784CIP2B 50 


1144 


378 


2164 


3950 


5736 


784CIP2B 51


1262 


379 


2165 


3951


5737 


784CIP2B 52 


1318 


380 


2166 


3952


5738


784CIP2B 53 


1319 


381


2167 


3953


5739 


784CIP2B 54 


1328


382 


2168 


3954 


5740


784CIP2B 55 


1436 


383 


2169 


3955 


5741 


784CIP2B 56 


1464 


384 


2170 


3956 


5742 


784CIP2B 57


1584 


385 


2171 


3957 


5743 


784CIP2B 58 


1617 


386 


2172 


3958


5744 


784CIP2B 59 


1724 


387 


2173 


3959 


5745 


784CIP2B 60 


1728 


388


2174 


3960 


5746 


784CIP2B 61 


1772 


389


2175 


3961 


5747 


784CIP2B 62


1809 


390 


2176 


3962


5748 


784CIP2B 63 


1868 


391 


2177 


3963 


5749 


784CIP2B 64 


1898 


392 


2178 


3964 


5750 


784CIP2B 65 


1926 


393 


2179 


3965 


5751 


784CIP2B 66


19^5 


394 


2180 


3966 


5752 


784CIP2B 67


1967 


395 


2181


3967 


5753 


784CIP2B 68 


1995 


396 


2182 


3968 


5754 


784CIP2B 69


2005


397 


2183 


3969 


5755 


784CIP2B 70 


2027 


398 


2184 


3970 


5756


784CIP2B 71


2055 


399 


2185 


3971


5757


784CIP2B 72 


2103 


400 


2186


3972


5758 


784CIP2B 73


2106 


401 


2187 


3973


5759


784CIP2B 74 


2166 


402 


2188 


3974 


5760 




2175


403 


2189 


3975


5761


784CIP2B 76 


2176 


404


2190 


3976 


5762 


784CIP2B 78 


2236 


405


2191 


3977


5763 


784CIP2B_79 


2250 


406 


2192 


3978


5764 


784CIP2B_80


2306


407 


2193 


3979 


5765


784CIP2B 81 


2323 


408 


2194 


3980 


5766


784CIP2B 82 


2340 


409 


2195 


3981


5767 


784CIP2B 83 


2371 


410 


2196 


3982 


5768 


784CIP2B 84


2399 


411 


2197 


3983 


5769 


784CIP2B 85


2411 


412 


2198 


3984 


5770 


784CIP2B_86


2428 


413 


2199 


3985 


5771 


784CIP2B_87 


2430 


414 


2200 


3986 


5772 


784CIP2B 88 


2439 


415


2201 


3987 


5773 


784CIP2B_89


2447 


416 


2202 


3988 


5774 


784CIP2B 90 


2461 


417 


2203 


3989 


5775


784CIP2B_91


2487 


418 


2204 


3990 


5776 


784CIP2B 92 


2492 


419 


2205


3991 


5777 


784CIP2B 93 


2512 


420 


2206 


3992 


5778 


784CIP2B 94 


2564 


421 


2207 


3993 


5779 


784CIP2B 95 


2678 


422 


2208 


3994 


5780 


784CIP2B 96 


2816 


423 


2209 


3995 


5781 


784CIP2B 97 


2818 


424 


2210 


3996 


5782 


784CIP2B 98 


2819 


425 


2211 


3997 


5783 


784CIP2B_99 


2943 


426 


2212 


3998 


5784 


784CIP2B_100


3137 


427 


2213 


3999 


5785 


784CIP2B 101 


3137 


428 


2214 


4000 


5786 


784CIP2B 102 


3160 


429 


2215 


4001 


5787 


784CIP2B 103 


3323 


430 


2216 


4002 


5788 


784CIP2B_104


3360 


431 


2217 


4003 


5789 


784CIP2B 105 


3362 



432 


2218 


4004 


5790 


784CIP2B 106 


3417


433 


2219 


4005 


5791 


784CIP2B 107


3418 


434 


2220 


4006 


5792 


784CIP2B 108 


3442 


435 


2221 


4007 


5793 


784CIP2B 109 


3442


436 


2222 


4008 


5794 


784CIP2B 110


3444 


437 


2223 


4009 


5795 


784CIP2B 111 




438 


2224 


4010 


5796 




3863 


439 


2225 


4011 


5797


784CIP2B 113


4090 


440 


2226 


4012 


5798


784CIP2B 114


4105


441 


2227 


4013 


5799


784CIP2B 115


4142 


442 


2228 


4014 


5800


784CIP2B 116


4142


443 


2229


4015 


5801 


784CIP2B 117


4149


444 


2230 


4016


5802 


784CIP2B 118


4196 


445


2231 


4017


5803




4202 


446 


2232 


4018 


5804 


784CIP2B_120


4274 




2233 


4019 


5805


784CIP2B_121


4304 


448


2234


4020 


5806 


784CIP2B_122 


4306 


449 


2235


4021 


5807


784CIP2B_123


4311


450 


2236


4022 


5808


784CIP2B_124 


4321 




2237 


4023 


5809 


784CIP2B_125


4323 


452


2238


4024 


5810 


784CIP2B 126


4332 




2239 


4025 


5811 


784CIP2B 127 


4488 


454


2240 


4026 


5812 


784CIP2B_128


4588 


455 


2241 


4027 


5813 


784CIP2B_129 


5569 


456


2242 


4028 


5814 


784CIP2B_130


5573


457


2243 




5815 


784CIP2B_131


5577 




2244 


4030 


5816 


784CIP2B 132 


5579 


459


2245


4031 


5817 


784CIP2B 133


5582 


460 


2246 


4032 


5818 


784CIP2B 134


5583 


461 


2247 


4033 


5819 


784CIP2B 135


5584 


462 


2248 


4034 


5820 




5585 


463 


2249 


4035 


5821


784CIP2B 137


5591


464 


2250 


4036


5822


784CIP2B_138


— a-e-iv5 


465 


2251 


4037 


5823




5594 


466 


2252 


4038 


5824




5594


467 


2253 


4039 


5825 


784CIP2B 141 


5598 


468


2254 


4040 


5826


784CIP2B 142


5602 


469


2255 


4041 


5827 


784CIP2B 143 


5605 


470 


2256 


4042 


5828 


784CIP2B 144 


5608 


471 


2257 


4043 


5829 


784CIP2B 145 


5617 


472 


2258 


4044 


5830 


784CIP2B_146 


s^iCiS 


473


2259 


4045 


5831 


784CIP2B 147 


5622 


474 


2260 


4046 


5832 


784CIP2B_148 


5623 


475


2261 


4047 


5833 


784CIP2B 149


5624 


476 


2262 


4048 


5834 


784CIP2B 150 


5625 


477 


2263 


4049 


5835 


784CIP2B 151


5627 


478 


2264 


4050 


5836 


784CIP2B_152


5628 


479 


2265


4051


5837 


784CIP2B 153


5630 


480 


2266 


4052 


5838 


784CIP2B_154 


5632 


481 


2267 


4053 


5839 


784CIP2B_155


5640 


482


2268 


4054 


5840 


784CIP2B_156 


5641 


483 


2269 


4055 


5841 


784CIP2B_157


5643 


484 


2270 


4056 


5842


784CIP2B_158 


5647 


485 


2271 


4057 


5843 


784CIP2B 159 


5643 


486 


2272 


4058


5844


784CIP2B_160


5658 


487 


2273 


4059 


5845 


784CIP2B_161 


5659 


488 


2274 


4060 


5846 


784CIP2B_162


5667 


489


2275


4061 


5847 


784CIP2B_163 


5672 


490 


2276 


4062


5848


784CIP2B 164 


5674 


491


2277 


4063 


5849


784CIP2B 165 


5678 


492 


2278 


4064 


5850 


784CIP2B 166 


5680 


493 


2279 


4065


5851 


784CIP2B 167 


5664


494 


2280 


4066 


5852


784CIP2B 168


5686


495 


2281 


4067 


5853 


784CIP2B 169


5694 


496 


2282 


4068 


5854 


784CIP2B 170


5698 


497 


2283 


4069 


5855 


784CIP2B 171 


5699 


498 


2284 


4070 


5856 


784CIP2B 172 


5712 


499 


2285 


4071 


5857 


784CIP2B 173 


5719 


500 


2286


4072 


5858 


784CIP2B 174 


5720 


501 


2287 


4073 


5859 


784CIP2B 175 


5727


502


2288 


4074 


5860 


784CIP2B 176


5730 


503 


2289 


4075 


5861


784CIP2B 177


5734 


504 


2290 


4076 


5862 


784CIP2B 178 


5738 


505 


2291


4077 


5863 


784CIP2B 179 


5739 


506 


2292 


4078 


5864 


784CIP2B 180


5740 


507 


2293 


4079 


5865


784CIP2B 181 


5744 


508 


2294 


4080 


5866 


784CIP2B 182 


5748 


509 


2295 


4081 


5867 


784CIP2B 183


5749


510 


2296 


4082 


5868 


784CIP2B 184


5750 


511 


2297 


4083 


5869 


784CIP2B 185 


5750 


512 


2298 


4084 


5870 


784CIP2B 186


5750 


513 


2299 


4085 


5871


784CIP2B 187


5761


514 


2300 


4086


5872


784CIP2B 188


5762 


515


2301 


4087 


5873


784CIP2B 189 


5767 


516 


2302 


4088


5874 


784CIP2B 190


5773 


517 


2303 


4089 


5875 


784CIP2B 191 


5783 


518 


2304 


4090 


5876 


784CIP2B 192


5784 


519 


2305 


4091 


5877 


784CIP2B 193 


5788 


520 


2306 


4092 


5878 


784CIP2B 194 


5798 


521 


2307 


4093 


5879 


784CIP2B 196


5807 


522 


2308 


4094 


5880 


784CIP2B 197 


5816 


523 


2309 


4095 


5881 


784CIP2B 198 


5819 


524 


2310 


4096 


5882


784CIP2B 199 


5827 


525 


2311 


4097 


5883 


784CIP2B 200


5828 


526 


2312 


4098


5884


784CIP2B_201


5842


527 


2313


4099 


5885 


784CIP2B_202 


5853 


528 


2314


4100 


5886 


784CIP2B 203 


5861 


529 


2315 


4101 


5887 


784CIP2B 204


5864 


530 


2316 


4102 


5888


784CIP2B_205


5865 


531 


2317 


4103 


5889 


784CIP2B 206


5871 


532 


2318 


4104 


5890 


784CIP2B 207


5873 


533 


2319 


4105 


5891 


784CIP2B 208 


5873 


534 


2320 


4106 


5892 


784CIP2B 209 


5875 


535 


2321 


4107 


5893 


784CIP2B 210 


5878 


536 


2322 


4108 


5894 


784CIP2B 211


5879 


537


2323 


4109 


5895


784CIP2B 212 


5880 


538


2324 


4110 


5896 


784CIP2B 213 


5880 


539 


2325 


4111 


5897 


784CIP2B 214


5880 


540 


2326 


4112 


5898 


784CIP2B 215 


5880


541


2327 


4113 


5899 


784CIP2B 216 


5885


542 


2328 


4114 


5900 


784CIP2B 217 


5895 


543 


2329 


4115 


5901 




5898 


544 


2330 


4116 


5902 


784CIP2B 219 


5902 


545 


2331 


4117 


5903 


784CIP2B_220 


5904 


546 


2332 


4118 


5904 


784CIP2B_221


5918 


547 


2333 


4119 


5905 


784CIP2B 222 


5921 


548 


2334 


4120 


5906 


784CIP2B 223


5927 


549 


2335 


4121 


5907 


784CIP2B_224


5932 


550 


2336 


4122 


5908 


784CIP2B 225 


5933 


551 


2337 


4123 


5909 


784CIP2B 226 


5945 


552 


2338 


4124 


5910 


784CIP2B 227


5946 


553 


2339 


4125 


5911 


784CIP2B 228


5947 


554 


2340 


4126 


5912 


784CIP2B 229 


5956 


555 


2341 


4127 


5913 


784CIP2B 230 


5967 



556 


2342 


4128 


5914 


784CIP2B_232 


5975


557 


2343 


4129 


5915 


784CIP2B_233


5977 


558


2344 


4130 


5916 


784CIP2B 234 


5978 


559 


2345 


4131 


5917 


784CIP2B 235 


5979 


560 


2346 


4132 


5918 


784CIP2B_236 


5980 


561


2347 


4133 


5919 


784CIP2B 237


5988 


562 


2348 


4134 


5920 


784CIP2B 238


5989 


563 


2349 


4135 


5921 


784CIP2B_239


5991 


564 


2350 


4136 


5922 


784CIP2B_240 


5997 


565 


2351 


4137 


5923 


784CIP2B 241 


5998


566 


2352 


4138 


5924 


784CIP2B 242 


6003 


567 


2353 


4139 


5925 


784CIP2B_243 


6004 


568 


2354 


4140 


5926 


784CIP2B_244 


6013 


569 


2355 


4141 


5927 


784CIP2B_245


6028 


570 


2356 


4142 


5928 


784CIP2B 246 


6028 


571 


2357 


4143 


5929 


784CIP2B 247 


6029 


572 


2358 


4144 


5930 


784CIP2B_248 


6031 


573 


2359 


4145 


5931 


784CIP2B_249


6031 


574 


2360


4146 


5932 


784CIP2B 250 


6032 


575 


2361 


4147 


5933 


784CIP2B 251 


6037 


576 


2362 


4148 


5934 


784CIP2B_252 


6037 


577 


2363 


4149 


5935 


784CIP2B 253 


6043 


578


2364


4150 


5936 


784CIP2B 254 


6044 


579


2365 


4151 


5937 


784CIP2B_255


6046 


580 


2366 


4152 


5938 


784CIP2B_256


6048 


581 


2367 


4153 


5939 


784CIP2B_257


6049 


582 


2368 


4154


5940 


784CIP2B 258 


6051


583 


2369 


4155 


5941 


784CIP2B 259 


6053 


584 


2370 


4156


5942 


784CIP2B 260 


6060 


585 


2371 


4157


5943 


784CIP2B 261 


6063 


586 


2372 


4158 


5944 


784CIP2B 262 


6066 


587 


2373 


4159 


5945


784CIP2B 263 


6067 


588 


2374


4160 


5946 


784CIP2B_264


6068


589 


2375 


4161 


5947 


784CIP2B_265 


6073 


590 


2376 


4162 


5948 


784CIP2B_266 


6076 


591 


2377 


4163 


5949 


784CIP2B 267 


6076 


592 


2378 


4164 


5950 


784CIP2B 268 


6077 


593 


2379 


4165 


5951 


784CIP2B 269 


6079 


594 


2380


4166 


5952 


784CIP2B 270


6082 


595 


2381 


4167 


5953 


784CIP2B_272 


6086 


596 


2382 


4168 


5954 


784CIP2B_273 


6091 


597 


2383


4169 


5955 


784CIP2B_274 


6094 


598 


2384 


4170 


5956 


784CIP2B_275


6101 


599 


2385 


4171 


5957 


784CIP2B_276 


6103 


600


2386 


4172 


5958 


784CIP2B 277 


6104 


601 


2387 


4173 


5959 


784CIP2B 278 


6108 


602 


2388


4174 


5960 


784CIP2B_279 


6112 


603 


2389 


4175 


5961 


784CIP2B 280 


6121 


604 


2390 


4176 


5962


784CIP2B 281 


6125 


605


2391 


4177 


5963 


784CIP2B 282


6126


606 


2392 


4178 


5964 


784CIP2B 283 


6126 


607 


2393 


4179 


5965 


784CIP2B 284


6129


608 


2394 


4180 


5966 


784CIP2B 285 


6133 


609 


2395 


4181


5967 


784CIP2B 286


6133 


610 


2396 


4182 


5968 


784CIP2B 287 


6135 


611 


2397 


4183 


5969 


784CIP2B 288 


6139 


612 


2398 


4184 


5970 


784CIP2B_289


6141 


613 


2399 


4185 


5971 


784CIP2B 290 


6145 


614 


2400 


4186 


5972 


784CIP2B_291 


6146 


615 


2401 


4187 


5973 


784CIP2B 292 


6148 


616 


2402 


4188 


5974 


784CIP2B 293 


6149 


617 


2403


4189 


5975 


784CIP2B 294 


6149


618 


2404


4190 


5976 


784CIP2B 295 


6153 


619 




4191 


5977 


784CIP2B 296 


6159 


620 




4192 


5978 


784CIP2B 297


6164 


621 


2407 


4193 


5979 


784CIP2B 298 


6167 


622 


2408


4194 


5980 


784CIP2B 299 


6172 


623 


2409 


4195 


5981 


784CIP2B 300 


6173


624 


2410 


4196 


5982 


784CIP2B 301 


6190 




2411 


4197 


5983 


784CIP2B 302


6194 


626 


2412 


4198 


5984 


784CIP2B 303


6196 


627 


2413 


4199 


5985 


784CIP2B 304 


6197 


628 


2414 


4200 


5986 


784CIP2B 305 


6198


629 


2415 


4201 


5987 


784CIP2B 306


6198 


630 


2416 


4202 


5988 


784CIP2B 308 


6214 


631 


2417 


4203 


5989 


784CIP2B 309 


6215 


632 


2418 


4204 


5990 


784CIP2B 310 


6219 


633 


2419 


4205 


5991 


784CIP2B 311 


6226 


634 


2420 


4206 


5992 


784CIP2B 312


6229 


635 


2421 


4207 


5993 


784CIP2B 313 


6234 


636 


2422 


4208 


5994 


784CIP2B 314 


6237 


637 


2423 


4209 


5995


784CIP2B 315


6238 


638


2424 


4210 


5996 


784CIP2B 316 


6239 


639 


2425 


4211 


5997 


784CIP2B 317 


6239 


640 


2426 


4212 


5998 


784CIP2B 318 


6239 


641 


2427 


4213 


5999 


784CIP2B 319 


6240 


642 


2428 


4214 


6000 


784CIP2B 320 


6244


643 


2429 


4215 


6001 


784CIP2B 321 


6245 


644 


2430 


4216 


6002 


784CIP2B_322 


6250 


645 


2431 


4217 


6003 


784CIP2B 323


6252 




2432


4218 


6004 


784CIP2B 324 


6252 


647 


2433 


4219 


6005 


784CIP2B_325


6256 


648 


2434 


4220 


6006 


784CIP2B 326 


6260 


649 


2435 


4221 


6007 


784CIP2B 327 


6261 


650


2436 


4222 


6008 


784CIP2B 328


6264


651 


2437 


4223 


6009 


784CIP2B 329 


6265 


652 


2438 


4224 


6010 


784CIP2B 330 


6266 


653 


2439 


4225 


6011 


784CIP2B 331 


6270 


654 


2440


4226 


6012 


784CIP2B 332 


6271




2441 


4227 


6013 


784CIP2B 334


6274 


656 




4228 


6014


784CIP2B 335


6276 


657


2443 


4229 


6015 


784CIP2B 336 


6281 


658 


2444 


4230 


6016 


784CIP2B 337 


6281 


659 


2445 


4231 


6017 


784CIP2B 338 


6288 


660 


2446 


4232 


6018


784CIP2B 339 


6292 


661 


2447


4233 


6019 


784CIP2B 340 


6294 


662 


2448 


4234 


6020


784CIP2B 343


6312 


663 


2449 


4235 




784CIP2B 344


6312 


664 


2450


4236 






6312 


665 


2451 


4237 


6023




6322 


666 


2452 


4238 


6024


784CIP2B 347


6324 


667 


2453 


4239 


6025 


784CIP2B 349


6329 


668


2454 


4240 


6026 


784CIP2B 350 


6331 


669 


2455 


4241 


6027 


784CIP2B 351 


633^ ■ 


670 


2456 


4242 


6028 


784CIP2B 352 


6334 


671 


2457 


4243 


6029 


784CIP2B 353


6337 


672 


2458 


4244 


6030 


784CIP2B 354 


6339 


673 


2459 


4245 


6031 


784CIP2B_355 


6346 


674 


2460 


4246 


6032 


784CIP2B 356 


6348 


675 


2461 


4247 


6033 


784CIP2B 357 


6348


676 


2462 


4248 


6034 


784CIP2B 358 


6350 


677 


2463 


4249 


6035 


784CIP2B 359 


6351 


678 


2464 


4250 


6036 


784CIP2B_360 


6355 


679 


2465 


4251 


6037 


784CIP2B_361


6362


680


2466 


4252


6038 


784CIP2B 362 


6368 


681 


2467 


4253 


6039 


784CIP2B 363 


6369 


682 


2468 


4254


6040 


784CIP2B 364


6371 


683 


2469 


4255 


6041 


784CIP2B 365 


6376 


684 


2470 


4256


6042 


784CIP2B 366


6379 


685


2471 


4257


6043 


784CIP2B 367 


6380


686 


2472 


4258


6044 


784CIP2B 368


6381 


687 


2473 


4259 


6045


784CIP2B 369 


6392 


688


2474 


4260 


6046 


784CIP2B 370




689 


2475 


4261 


6047 


784CIP2B 371


/ 


690 


2476 


4262 


6048 


/ Csw JL XTiCu J 


Sinn 


691 


2477 


4263 


6049 




fC A nt 


692 


2478 


4264 


6050 




641X 


693 


2479 


4265 




784CIP2B 375


6411 


694 


2480 


4266 


6052




6411


695 




4267


6053 


784CIP2B 377


6416 


696




4268 


6054 


784CIP2B__378 


6418 


697 




4269 


6055 


784CIP2B_379 


6422 


698




4270


6056


784CIP2B_380


6423 




2485


4271 


6057 


784CIP2B_381


6426 


700 


2486


4272 


6058 


784CIP2B 382


6427 


701 


2487


4273 


6059


784CIP2B 383 


6428 


702 


2488


4274 


6060 


784CIP2B_384


6429 


703 


2489 


4275 


6061 


784CIP2B_385


6430 


704 


2490 




6062 


784CIP2B_386


6432 


705


2491


4277


6063 


784CIP2B 387 


6432 


706 






6064 


784CIP2B 388


6438 


707 


2493 


4279 


6065 


/"•XT^m 'acta 


6441 


708 


2494 


4280


6066 




6446 


709 


2495


4281 


6067 




6454 


710 


2496 


4282 


6068 




6459 


711 


2497 


4283 


6069 






712 


2498 


4284 


6070 


TRACT Don 7 QC 


O^O / 


713 


2499 


4285 


6071 




6468 


714 


2500 


4286 


6072 


784CIP2B 397 


6487 


715 


2501 


4287 


6073 


784CIP2B 398


6491 


716 


2502


4288 


6074 


784CIP2B 399


5S(^^ " 


717 


2503 


4289


6075 


784CIP2B 401


6514 


718 


2504 


4290 


6076


784CIP2B 402


6519 


719 


2505 


4291 


6077 


784CIP2B 403 


6521 


720 


2506 


4292 


6078 


784CIP2B 404 


6532 


721 


2507 


4293 


6079 


784CIP2B 405


6536 


722 


2508 


4294 


6080


784CIP2B 406


6543 


723 


2509 


4295 


6081 


784CIP2B 407 


6544 


724 


2510 


4296 


6082 


784CIP2B 408


6548 


725


2511 


4297 


6083 


784CIP2B 409 


6551 


726 


2512 


4298 


6084 


784CIP2B 410 


6551 


727 


2513 


4299 


6085 


784CIP2B_411 


6552


728 


2514 


4300 


6086 


784CIP2B_412 


6554 


729 


2515


4301 


6087 


784CIP2B 413 


6556


730 


2516


4302


6088 


784CIP2B 414 


6560 


731 


2517 


4303 


6089 


784CIP2B_415


6563 


732 


2518 


4304 


6090 


784CIP2B 416 


6564 


733 


2519 


4305 


6091 


784CIP2B 417 


6567 


734 


2520 


4306 


6092


784CIP2B 418 


6573 


735 


2521 


4307 


6093 


784CIP2B_419


6575 


736 


2522 


4308 


6094 


784CIP2B_420 


6577 


737 


2523 


4309 


6095 


784CIP2B 421 


6593 


738 


2524 


4310 


6096 


784CIP2B 422 


6595 


739 


2525


4311 


6097 


784CIP2B 423 


6599 


740 


2526 


4312 


6098 


784CIP2B 424 


6625 


741 


2527 


4313 


6099 


784CIP2B 425


6625 



742 






6100 


784CIP2B 426 


6626 


743 


2529 




6101 


784CIP2B_427 


6630 


744 


2530


4316


6102 


784CIP2B_428 


6631 


745 




4317 


6103 


784CIP2B 429


6632


746


2532


4318 


6104 


784CIP2B 430


6633 


747 




4319


6105 


784CIP2B 431 


6634 


748


2534 


4320 


6106 


784CIP2B 432 


6638 




2535 


4321 


6107 


784CIP2B 433 


6641 




2536 


4322 


6108 


784CIP2B 434 


6644 




2537 


4323 


6109 


784CIP2B 435 


6646 


752 


2538 


4324 


6110 


784CIP2B_436


6648 


753 


2539


4325 


6111 


784CIP2B 437 


6652


754


2540 


4326 


6112 


784CIP2B 438 


6654 


755 


2541


4327 


6113 


784CIP2B 439 


6657 


756 


2542 


4328 


6114 


784CIP2B 440 


6658 


757 


2543 


4329 


6115 


784CIP2B 441 


6663 


758 


2544 


4330 


6116 


784CIP2B 442 


6664 


759


2545 


4331 


6117 


784CIP2B 443 


6668 


760 


2546 


4332 


6118 


784CIP2B 444 


6669 


761


2547


4333 


6119 


784CIP2B 445 


6673 


762 


2548 


4334 


6120 


784CIP2B 446


6685 


763 


2549 


4335 


6121 


784CIP2B 447


6687 


764 


2550


4336 


6122 


784CIP2B 448 


6689


765 




2551 


4337 


6123 


784CIP2B 449 


6693 


766


2552 


4338 


6124 


784CIP2B 450 


6698 


767 


2553 


4339 


6125 


784CIP2B 451 


6699 


768 


2554


4340 


6126 


784CIP2B 452 


6705 


769 


2555 


4341 


6127 


784CIP2B 453 


6711 


770 


2556 


4342 


6128 


784CIP2B 454


6713 


771 


2557 


4343 


6129 


784CIP2B_455 


6716


772 


2558


4344 


6130 


784CIP2B_456 


6725 


773 


2559 


4345 


6131 


784CIP2B 457 


6726 


774 


2560 


4346 


6132 


784CIP2B_458


6727 


775


2561 


4347 


6133 


784CIP2B_459 


6730 


776 




4348 


6134 


784CIP2B_460 


6730 


777




4349 


6135 


784CIP2B 461


6730 


778 




4350 


6136


784CIP2B 462 


6732 


779


2565 


4351 


6137 


784CIP2B 463 


6733 


780 


2566


4352


6138 


784CIP2B 464


6737 


781 


2567 


4353 


6139 


784CIP2B 465


6745


782


2568 


4354


6140


784CIP2B 466


6751 


783 


2569 


4355 


6141


784CIP2B 467 


6754 


784 


2570


4356


6142


784CIP2B 468


6758 


785 


2571


4357


6143 


784CIP2B 469 


6761 


786 


2572 


4358 




784CIP2B 470 


6765 


787 


2573 


4359 




784CIP2B 471 


6768 


788 


2574 


4360 


6146 


784CIP2B 472


6773 


789 


2575 


4361 




784CIP2B 473


6776 


790 


2576 


4362 


6148


784CIP2B 474


6796 


791 


2577 


4363 


6149 


784CIP2B 475 


6798 


792 


2578


4364 


6150 


784CIP2B 476


6823 


793 


2579 


4365 


6151 


784CIP2B 477


6825 


794 


2580 


4366 


6152 


784CIP2B 478 


6826 


795 


2581


4367 


6153 


784CIP2B 479 


6839 


796 


2582


4368 


6154 


784CIP2B 480 


6844 


797 


2583


4369


6155 


784CIP2B 482 


6849 


798 


2584 


4370 


6156 


784CIP2B__483 


6854 


799 


2585 


4371 


6157 


784CIP2B 484


6857 


800 


2586 


4372 


6158 


784CIP2B_485


6861 


801 


2587 


4373 


6159 


784CIP2B 486 


6873 


802 


2588 


4374 


6160 


784CIP2B 487 


6875 


803


2589


4375 


6161 


784CIP2B 488 


6877


804 


2590 


4376 


6162 


784CIP2B 489 


6880


805 


2591


4377 


6163 


784CIP2B 490 


6885


806 


2592 


4378 


6164 


784CIP2B 491 


6890 


807 


2593 


4379 


6165 


784CIP2B 492 


6890 


808 


2594 


4380 


6166 


784CIP2B 493


6894 


809 


2595


4381 


6167 




6901 


810 


2596


4382 


6168 


784CrP?R 


6904 


811 


2597 


4383 


6169 




6907 


812 


2598 


4384 


6170 




6914 


813 


2599 


4385 


6171 


784CIP2B 498


6917 


814


2600 


4386 


6172


784CIP2B 499


6923 


815 


2601 


4387 


6173 


784CIP2B 500


6929 


816 


2602 


4388 


6174 


784CIP2B 501


6931 


817 


2603 


4389 


6175 


784CIP2B 502


6935 


818 


2604 


4390


6176


784CIP2B 503 


6940 


819


2605 


4391


6177


784CIP2B 504


6945 


820 


2606 




6178 


784CIP2B 505


6946 


821 


2607 




6179


784CIP2B 506 


6947 


822 


2608 


4394


6180 


784CIP2B 507 


6949 


823 


2609 


4395 


6181 


784CIP2B 508


6959 


824 


2610 


4396


6182


784CIP2B 509 


6960 


825 


2611 


4397 


6183 


784CIP2B 510


6962 


826 


2612 


4398


6184


784CIP2B 511


6963 


827 


2613 




6185 


784CIP2B 512 


6967 


828 


2614 


4400


6186 


784CIP2B 513 


6983 


829 


2615 


4401


6187


784CIP2B 514


6988 


830 


2616 


4402 




784CIP2B 515 


6996 


831


2617 


4403 


6189


784CIP2B 516


7003 


832


2618 


4404 


6190 


784CIP2B 517


7016 


833 


2619 


4405 


6191 


784CIP2B 518


7017 


834 


2620 


4406 


6192 


784CIP2B 519


7025 


835 


2621 


4407 


6193 




7025 


836 


2622 


4408 


6194 


/o^ ^wX«rZo__D^x 


7025 


837 


2623 


4409 


6195 




7050 


838 


2624 


4410 


6196 




7051 


839 


2625 


4411 


6197 




7055 


840 


2626 


4412 


6198 




7060 


841 


2627 


4413 


6199 


784CIP2B 526


7064


842 


2628 


4414 


6200 


784CIP2B 527


7067 


843 


2629 


4415 


6201 


784CIP2B 528 


7071 


844 


2630


4416 


6202 


784CIP2B 529


7072 


845


2631 


4417 


6203 


784CIP2B 530 


7073 


846 


2632 


4418 


6204 


784CIP2B 531


7076 


847 


2633 


4419 


6205 


784CIP2B 532


7088


848 


2634


4420 


6206 


784CIP2B 533 


7089 


849 


2635 


4421 


6207 


784CIP2B 534 


7091 


850 


2636 


4422 


6208 


784CIP2B 535 


7091 


851 


2637 


4423 


6209 


784CIP2B 536


7104 


852 


2638 


4424 


6210 


784CIP2B 537


7105 


853 


2639 


4425 


6211 


784CIP2B_538


7105 


854 


2640 


4426 


6212 


784CIP2B 539 


7109 


855 


2641 


4427 


6213 


784CIP2B 540


7109 


856 


2642 


4428 


6214 


784CIP2B 541


7119 


857 


2643 


4429 


6215 


784CIP2B 542 


7120 


858


2644 


4430 


6216 


784CIP2B 543 


7121 


859 


2645 


4431 


6217 


784CIP2B 544 


7126 


860 


2646 


4432 


6218


784CIP2B_545


7127 


861 


2647 


4433 


6219 


784CIP2B_546


7130 


862 


2648 


4434 


6220 


784CIP2B 547 


7131 


863 


2649 


4435 


6221 


784CIP2B 548 


7144 


864


2650 


4436 


6222 


784CIP2B 549 


7159 


865 


2651 


4437 


6223 


784CIP2B 550 


7163


866 


2652 


4438 


6224 


784CIP2B 551 


7175 


867 


2653 


4439 


6225 


784CIP2B 552


7188 


868 


2654 


4440 


6226 


784CIP2B 553 


7189


869 


2655 


4441 


6227 


784CIP2B 554


7190


870 


2656 


4442 


6228 


784CIP2B 555 


7191 


871


2657 


4443 


6229 


784CIP2B 556


7203 


872 


2658


4444 


6230 


784CIP2B 557


7204 


873 


2659 


4445 


6231 


784CIP2B 558


7208 


874 


2660 


4446 


6232 




7209 


875 


2661 


4447 


6233 


784CIP2B 560


7210 


876 


2662 


4448 


6234




7216 


877 


2663 


4449 


6235




7221 


878 


2664 


4450 


6236 




7230 


879 


2665 


4451 


6237 




7237 


880


2666 


4452 


6238




7240 


881


2667 


4453 


6239 




7245 


882 


2668 


4454


6240




7250 


883 




4455


6241 


784CIP2B 568 


7251 


884 


2670 


4456






7255 


885


2671 




6243 


784CIP2B 570


7260 


886


2672 


4458 


6244 


784CIP2B 571


7265 


887 


2673 






784CIP2B 572 


7268 


888


2674 


4460






7275 


889 


2675 


4461 


6247 


*7Q A/^TtJ*! O CIA 


7279 


890 


2676 


4462 


6248




7283 


891


2677 


4463


6249 


784CIP2B 576


7283 


892 


2678


4464 


6250 




7287 


893 


2679


4465 


6251 


/9H.\^±k'^ti 3 /cf 


7301


894 


2680 


4466 


6252 


784CIP2B_579


7306 


895


2681 


4467 


6253




7308 


896 


2682 


4468 


6254 


7 ft d p T vo n ft T 


7309 


897 


2683


4469 


6255 


784PTP3tt "cap " 




898 


2684 


4470 


6256 


' 78JtPTP:>n'''<ift^ — 


WVH'a 


899 


2685


4471 


6257 


784CIP2B 584

« *v V'Xlr^X? -J Tt 




900 


2686 


4472 


6258 


784CIP2B 585 


7326 


901 


2687 


4473 


6259 


784CIP2B 586 


7334


902 


2688 


4474 


6260 


784CIP2B 587


7337


903 


2689 


4475 


6261 


784CIP2B 588


7339


904 


2690 


4476 


6262 


784CIP2B 589 


7344 


905 


2691 


4477 


6263


784CIP2B 590


7355 


906 


2692


4478 


6264


784CIP2B 591 


7363 


907 


2693 


4479 


6265 


784CIP2B 592


7363 


908 


2694


4480 


6266 


784CIP2B 593


7365 


909 


2695 


4481 


6267 


784CIP2B 594 


7368


910 


2696 


4482 


6268 


784CIP2B 595 


7369 


911 


2697 


4483 


6269 


784CIP2B 596 


7372 


912 


2698


4484 


6270 


784CIP2B_599


7375 


913 


2699 


4485 


6271 


784CIP2B_600


7381 


914 


2700 


4486 


6272 


784CIP2B 601


7383


915


2701 


4487 


6273 


784CIP2B 602 


7387


916 


2702 


4488 


6274 


784CIP2B 603 


7391 


917 


2703 


4489


6275 


784CIP2B_604


7393 


918 


2704 


4490 


6276 


784CIP2B 605


7395


919 


2705 


4491 


6277 


784CIP2B 606 


7397 


920 


2706 


4492 


6278 


784CIP2B 607


7399 


921 


2707 


4493 


6279 


784CIP2B 608 


7405 


922 


2708 


4494 


6280 


784CIP2B 609


7406 


923 


2709 


4495 


6281


784CIP2B 610 


7406 


924 


2710 


4496 


6282 


784CIP2B 611 


7409 


925 


2711 


4497 


6283 


784CIP2B 612 


7410 


926 


2712 


4498 


6284 


784CIP2B 613 


7411 


927 


2713 


4499 


6285 


784CIP2B 614 


7417 
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PCT/US(M>/34263 



SEQ ID NO: 
Of full- 
length 
nucleotide 
sequence 


SEQ ID 
NO: of 
full- 
length 
peptide 
sequence


SEQ ID NO:
of contig 
nucleotide 
sequence 


SEQ ID 
NO: 

of contig 

peptide 

sequence 


Priority 
docket number_
corresponding 
SEQ ID NO: in 
priority 
application 


SEQ ID 
NO: in 
U.S.S.N.
09/488,725


928


2714 


4500 


6286 


784CIP2B_615


7418 


929 


2715


4501 


6287 


784CIP2B 616


7421 


930


2716 


4502 


6288 


784CIP2B 617


7422 


931 


2717 


4503 


6289


784CIP2B 618


7422 


932 


2718 


4504 


6290 


784CIP2B 619 


7423 


933


2719 


4505 


6291


784CIP2B 620


7424 


934 


2720 


4506 


6292 


784CIP2B 621


7426 


935


2721


4507 


6293 


784CIP2B 622


7427 


936 


2722 


4508 


6294 


784CIP2B_623


7428


937


2723 


4509 


6295 


784CIP2B_624


7430 


938


2724 


4510 


6296 


784CIP2B 625 


7435 


939


2725 


4511 


6297 


784CIP2B 626 


7437 


940


2726 


4512 


6298 


784CIP2B_627 


7439 


941 


2727 


4513 


6299 


784CIP2B 628 


7440 


942 


2728 


4514 


6300 


784CIP2B 629


7442 


943 


2729 


4515 


6301 


784CIP2B 630


7450 


944 


2730 


4516 


6302 


784CIP2B 631


7451 


945 


2731 


4517 


6303 


784CIP2B 632 


7452 


946 


2732 


4518 


6304 


784CIP2B 633


7454 


947


2733 


4519 


6305 


784CIP2B 634 


7457 


948 


2734 


4520 


6306 


784CIP2B 635


7459 


949 


2735 


4521 


6307 


784CIP2B 636 


7461 


950 


2736 


4522


6308 


784CIP2B 637 


7463 


951 


2737 


4523 


6309 


784CIP2B 638 


7466 


952 


2738 


4524 


6310 


784CIP2B 639 


7469 


953 


2739 


4525 


6311 


784CIP2B 640


7473


954 


2740 


4526 


6312 


784CIP2B 641


7481 


955 


2741 


4527 


6313 


784CIP2B 642 


7482 


956 


2742 


4528


6314 


784CIP2B 643 


7482 


957 


2743 


4529 


6315 


784CIP2B 644 


7483 


958 


2744 


4530 


6316 


784CIP2B 645 


7485 


959 


2745 


4531 


6317 


784CIP2B 646


7486


960 


2746 


4532 


6318 


784CIP2B_647 


7487 


961


2747 


4533 


6319 


784CIP2B 648


7491 


962 


2748 


4534 


6320 


784CIP2B 649


7492 


963 


2749 


4535 


6321 


784CIP2B_650 


7494 


964 


2750


4536 


6322 


784CIP2B_651


7498 


965


2751 


4537 


6323 


784CIP2B 652 


7504 


966 


2752 


4538 


6324 


784CIP2B 653


7508 


967 


2753 


4539 


6325 


784CIP2B_654 


7516 




2754


4540 


6326 


784CIP2B 655 


7518 




2755


4541 


6327 


784CIP2B_656


7519 


970 


2756


4542 


6328 


784CIP2B 657 


7521 


-* fJ, 


^ I 


4543 


6329 


784CIP2B 658


7529 


972 


2758 


4544 


6330 


784CIP2B 659 


7532 


973 


2759 


4545


6331 


784CIP2B 660


7533 


974 


2760 




6332 


784CIP2B 661 


7535 


975 


2761 


4547


6333 


784CIP2B 662 


7545 


976


2762 


4548 


6334 


* w *■ v« JL XT ^ O O Q O 


/340 


977 


2763


4549 


6335 


784CIP2B 664 


7552 


978 


2764 


4550


6336 


784CIP2B 665 


7554 


979 


2765


4551 


6337


784CIP2B 666 


7567 


980 


2766


4552


6338 


784CIP2B 667


7569 


981 




4553 


6339


784CIP2B 668 


7575 


982 


2768 


4554 


6340 


784CIP2B 669


7576 


983 


2769 


4555


6341 


784CIP2B_670


7577


984 


2770 


4556 


6342 


784CIP2B 671 


7579


985


2771 


4557 


6343 


784CIP2B 672


7582 


986 


2772 


4558 


6344 


784CIP2B_673 


7587


987 


2773 


4559


6345 


784CIP2B 674


7589 


988 


2774 


4560 


6346 


784CIP2B 675 


7537 


989 


2775 


4561 


6347 


784CIP2B 676 


7597
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SEQ ID NO: 
of full-

length 

nucleotide 

sequence 


SEQ ID 
NO: of 

full- 
length 
peptide 
sequence 


SEQ ID NO: 
of contig 
nucleotide 
sequence 


SEQ ID 
NO: 

of contig

peptide 

sequence 


Priority 
docket number_
corresponding 
SEQ ID NO: in 
priority 
application 


SEQ ID
NO: in
U.S.S.N.
09/488,725 


990 


2776 


4562 


6348 


784CIP2B 677 


7609 


991 


2777 


4563 


6349 


784CIP2B 678 


7609 


992


2778


4564 


6350


784CIP2B 679


7609 


993


2779 


4565 


6351 


784CIP2B 680 


7613 


994 


2780 


4566


6352 


784CIP2B 681


7623 


995


2781 


4567 


6353 


784CIP2B 682


7629 


996 


2782


4568 


6354 


784CIP2B 683 


7630 


997 


2783 


4569 


6355


784CIP2B 684 


7633


998 


2784 


4570 


6356 


784CIP2B 685


7635


999 


2785 


4571 


6357 




7638 


1000 


2786


4572 


6358 


7Hdr*TD*>li can 


7639 


1001 


2787 


4573 


6359


7ft4r'Tt>oii caa 

fQA^X±r^JE> 000 


7646 


1002 


2788 


4574 


6360 


toH\^xtr£o bay 


7647 


1003 


2789 


4575 


6361 


7fl4.r'TP'>R Son 


7648 


1004 


2790 


4576 


6362 


784CIP2B 691


7658


1005


2791 


4577 


6363 




7664 


1006 


2792 


4578 


6364 




7664 


1007 


2793 


4579 


6365 




7674


1008 


2794 


4580 


6366 


f 0 ?K v,,x Jr ^ 13 0 


7675 


1009 


2795


4581 


6367 


784CIP5R fiO"? 


7676 


1010 


2796 


4582 


6368 


784CTP5R ^^oiT" ' " 
' V Y Va>x •IT id D Ojro 




1011


2797 


4583


6369 


784CIP2B 699 


"iieifift """" 


1012 


2798 


4584 


6370 


784CIP2B 700 


7693


1013 


2799 


4585 


6371 


784CIP2B 701


7694 


1014 


2800 


4586


6372 


784CIP2B 702


7715 


1015 


2801 


4587 


6373 


784CIP2B 703 


7716 


1016 


2802 


4588 


6374


784CIP2B 704 


7718 


1017 


2803 


4589 


6375 


784CIP2B 705


7721 


1018 


2804 


4590 


6376


784CIP2B 706


7723 


1019 


2805


4591


6377 


784CIP2B 707 


7729 


1020 


2806 


4592


6378 


784CIP2B_708 


7733 


1021 


2807 


4593 


6379 


784CIP2B 709


7735 


1022 


2808


4594 


6380


784CIP2B_710


7741 


1023 


2809 


4595 


6381 


784CIP2B 711


7743 


1024 


2810 


4596 


6382 


784CIP2B 712 


7748 


1025 


2811 


4597 


6383 


784CIP2B 713 


7749 


1026 


2812 


4598 


6384


784CIP2B 714


7750


1027 


2813 


4599 


6385 


784CIP2B 715 


7757 


1028 


2814 


4600 


6386 


784CIP2B 716


7759 


1029 


2815


4601 


6387 


784CIP2B 717 


7760 


1030 


2816 


4602 


6388


784CIP2B 718


7760


1031 


2817 


4603 


6389 


784CIP2B 719 


7764 


1032 


2818 


4604 


6390 


784CIP2B 720 


7765 


1033 


2819 


4605 


6391 


784CIP2B_721 


7766 


1034 


2820 


4606 


6392 


784CIP2B 722 


7767 


1035 


2821 


4607 


6393 


784CIP2B 723 


7769 


1036 


2822 


4608 


6394 


784CIP2B 724 


7770 


1037 


2823 


4609 


6395 


784CIP2B_725


7774 


1038 


2824 


4610 


6396 


784CIP2B 726 


7779 


1039


2825


4611 


6397


784CIP2B 727 


7781 


1040 


2826 


4612 


6398 


784CIP2B 728


7782 


1041 


2827 


4613 


6399 


784CIP2B 729 


7783 


1042 


2828 


4614 


6400 


784CIP2B 730


7787 


1043 


2829 


4615


6401 


784CIP2B 731 


7792 


1044 


2830 


4616 


6402 


784CIP2B 732 


7795 


1045


2831


4617 


6403 


784CIP2B 733 


7801 


1046 


2832


4618 


6404 


784CIP2B 734 


7807 


1047 


2833 


4619 


6405 


784CIP2B_735


7808 


1048 


2834 


4620


6406 


784CIP2B 736


7819 


1049 


2835


4621 


6407 


784CIP2B 737 


7824 


1050 


2836 


4622 


6408 


784CIP2B 738 


7826 


1051


2837


4623 


6409 


784CIP2B 739 


7829
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SEQ ID WO: 
of full- 
length 
nucleotide 
sequence 


SEQ ID 
NO: of 
full-
length
peptide
sequence 


SEQ ID NO: 
of contig 
nucleotide 
sequence 


SEQ ID 
NO:

of contig 

peptide 

sequence 


Priority
docket number_

corresponding 
SEQ ID NO: in 
priority 
application 


SEQ ID
NO: in
U.S.S.N.
09/488,725


1052 


2838 


4624 


6410 


784CIP2B_740 


7832 


1053


2839 


4625 


6411 


784CIP2B 741 


7839 


1054 


2840 


4626 


6412 


784CIP2B 743 


7847 


1055


2841 


4627 


6413 


784CIP2B 744 


7848 


1056 


2842


4628 


6414 


784CIP2B 745 


7853 


1057 


2843 


4629 


6415 


784CIP2B 746


7854 


1058


2844 


4630 


6416 


784CIP2B 747


7856 


1059 


2845 


4631 


6417 


784CIP2B 748


7862 


1060 


2846 


4632 


6418 


784CIP2B 749 


7865 


1061 


2847 


4633 


6419 


784CIP2B 750 


7874 


1062


2848 


4634 


6420 


784CIP2B 751




1063 


2849 


4635 


6421 


784CIP2B 752 




1064 


2850 


4636 


6422 


784CIP2B 753 


7862 


1065 


2851


4637 


6423 


784CIP2B 754




1066 


2852 


4638 


6424 


784CIP2B 755 




1067 


2853 


4639 


6425 


784CIP2B 756


I 

/ OOo 


1068 


2854 


4640 


6426 


784CIP2B 757


— ^^^^ 


1069 


2855 


4641 


6427 




7901 


1070


2856 


4642 


6428 


784CIP2B 759


7910 


1071 


2857 


4643 


6429 


784CIP2B 760


7iJXl 


1072 


2858 


4644 


6430 




"n'a'^ n 


1073 


2859 


" 4645 


6431 


7S4CIP5R 'JRO 


7923 


1074 


2860 


4646 


6432 


' OTt V. J. It *Xj / O J 


7924 


1075 


2861 


4647 


6433 




7925


1076 


2862 


4648 


6434 




^"S^'q 


1077 


2863 


4649 


6435 


784CIP2B 766


/ 


1078 


2864 


4650 


6436 


784CIP2B 767


7930 


1079 


2865 


4651 


6437 


784CIP2B 768


7934


1080 


2866


4652 


6438 


784CIP2B 769




1081 


2867


4653 


6439 


784CIP2B 770 




1082 


" 2868 


4654 


6440 


784CIP2B 771 


7945 


1083


2869 


4655 


6441 


784CIP2B 772 


7946 


1084


2870 


4656 


6442 


784CIP2B 773 


7948 


1085


2871 


4657 


6443 


784CIP2B 774


7951


1086 


2872


4658 


6444 


784CIP2B 775


7952 


1087 


2873 


4659 


6445 


784CIP2B 776


7953 


1088 


2874 


4660 


6446 


784CIP2B 777


7954 


1089 


2875 


4661


6447 


784CIP2B 778


7957 


1090 


2876 


4662 


6448


784CIP2B 779


7958 


1091


2877 


4663 


6449 


784CIP2B 780 


7961 


1092 


2878 


4664 


6450 


784CIP2B 781 


7965 


1093 


2879 


4665


6451 


784CIP2B 782 


7966 


1094 


2880 


4666 


6452 


784CIP2B 783


7979 


1095 


2881 


4667 


6453 


784CIP2B 784 


7986 


1096


2882 


4668 


6454 


784CIP2B 785 


7986 


1097 


2883 


4669 


6455 


784CIP2B 786 


7988 


1098 


2884 


4670 


6456 


784CIP2B 787 


7991 


1099 


2885 


4671 


6457 


784CIP2B 788 


7992 


1100 


2886 


4672 


6458 


784CIP2B 789 


7992 


1101 


2887


4673 


6459 


784CIP2B 790 


7992 


1102 


2888 


4674 


6460


784CIP2B 791


7992 


1103 


2889 


4675


6461 


784CIP2B 792


8003 


1104 


2890 


4676 


6462 


784CIP2B 793 


8014 


1105


2891 


4677 


6463 


784CIP2B 794 


8015 


1106 


2892 


4678 


6464 


784CIP2B 795 


8016 


1107 


2893


4679 


6465 


784CIP2B 796


8017 


1108 


2894


4680 


6466 


784CIP2B_797 


8019 


1109 


2895 


4681 


6467 


784CIP2B 798 


8020 


1110 


2896 


4682 


6468


784CIP2B 799


8022 


1111 


2897 


4683 


6469 


784CIP2B 800


8022 


1112 


2898 


4684 


6470 


784CIP2B 801


8028 


1113 


2899 


4685


6471 


784CIP2B 802 


8030 



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SEQ ID NO: 

of full-
length 
nucleotide 
sequence 


SEQ ID 
NO: of 
full- 
length 
peptide 
sequence 


SEQ ID NO:
of contig 
nucleotide 
sequence 


SEQ ID
NO: 

of contig 

peptide 

sequence 


Priority
docket number 
corresponding 
SEQ ID NO: in 
priority 
application 


SEQ ID 
NO: in 
U.S.S.N.
09/488,725 


1114 


2900 


4686 


6472 


784CIP2B 803 


8038 


1115


2901 


4687 


6473 


784CIP2B 804 


8042 


1116 


2902 


4688 


6474 


784CIP2B 805 


8045 


1117 


2903 


4689 


6475 


784CIP2B 806


8045 


1118 


2904 


4690 


6476 


784CIP2B 807 


8046 


1119 


2905 


4691 


6477 


784CIP2B 808 


8047 


1120 


2906 


4692 


6478 


784CIP2B 809


8051 


1121 


2907 


4693 


6479 


784CIP2B 810 


8059 


1122 


2908 


4694 


6480 


784CIP2B 811 


8064 


1123 


2909 


4695 


6481 


784CIP2B 812 


8069 


1124 


2910 


4696 


6482 


784CIP2B 813


8074 


1125


2911 


4697 


6483 


784CIP2B 814


8077 


1126 


2912 


4698 


6484 


784CIP2B 815 


8078 


1127 


2913 


4699


6485


784CIP2B 816 


8079


1128 


2914 


4700 


6486 


784CIP2B 817 


8084


1129 


2915 


4701


6487 


784CIP2B 818


8088


1130 


2916


4702 


6488 


784CIP2B 819 


1 6096 


1131 


2917 


4703 


6489 


784CIP2B 820 


8091


1132 


2918 


4704 


6490 


784CIP2B 821 


8099


1133 


2919 


4705 


6491 


784CIP2B 822


86^9 


1134 


2920 


4706 


6492


784CIP2B 823


81 '6 b 


1135 


2921 


4707 


6493 


784CIP2B 824 


8102 


1136 


2922 


4708 


6494 


784CIP2B 825 


8103 


1137 


2923 


4709 


6495 


784CIP2B 826 


8103 


1138 


2924 


4710


6496 


784CIP2B 827


8104 


1139 


2925


4711


6497 


784CIP2B 828 


8108 


1140 


2926 


4712 


6498 


784CIP2B 829 


8110 


1141 


2927 


4713 


6499 


784CIP2B 830 


8116 


1142 


2928 


4714 


6500 


784CIP2B 831 


8117 


1143 


2929 


4715


6501


" 784CXP2B_832 ^ 


8123 


1144 


2930 


4716 


6502


784CIP2B 833 


8130 


1145 


2931 


4717


6503 


784CIP2B_834 


8130 


1146 


2932


4718 


6504 


784CIP2B 835 


8i4S 


1147 


2933 


4719 


6505 


784CIP2B 836


8143 


1148


2934 


4720 


6506 


784CIP2B 837 


8154 


1149 


2935 


4721 


6507 


784CIP2B 838 


8155


1150 


2936 


4722 


6508 


784CIP2B_839


8162 


1151 


2937


4723 


6509 


784CIP2B 840


8163


1152 


2938 


4724 


6510


784CIP2B 841 


8172 


1153 


2939 


4725 


6511 


784CIP2B 842 


8173 


1154


2940 


4726 


6512 


784CIP2B 843


BXi$ 


1155 


2941 


4727 


6513 


784CIP2B 844 


8182


1156 


2942 


4728 


6514 


784CIP2B 845 


8183 


1157 


2943 


4729 


6515


784CIP2B 846


8184 


1158 


2944 


4730 


6516 


784CIP2B 847 


8185 


1159 


2945


4731 


6517 


784CIP2B 848


8187 


1160 


2946 


4732 


6518 


784CIP2B 849 


8188


1161


2947 


4733 


6519 


784CIP2B 850 


8190


1162 
1163


2948 


4734 


6520 


784CIP2B 851 


8190 




2949


4735 


6521 


784CIP2B 852 


8192 


1164 


2950 


4736 


6522 


784CIP2B 853


8193 


1165 


2951 


4737 


6523


784CIP2B 854 


8197 


1166 


2952 


4738 


6524


784CIP2B_855


8197 


1167 


2953 


4739 


6525 


784CIP2B 856 


8199 


1168 


2954


4740 


6526


784CIP2B 857 


8202 


1169 


2955


4741 


6527


784CIP2B 858


8203 


1170 


2956 


4742 


6528 


784CIP2B 859


8208 


1171 


2957 


4743 


6529


784CIP2B 860 


8209 


1172 


2958 


4744 


6530 


784CIP2B 861 


8211 


1173 


2959


4745 


6531 


784CIP2B 862 


8214 


1174 


2960 


4746 


6532 


784CIP2B 863 


8217 




4747 


6533 


784CIP2B 864 


8223 



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of full- 
length 
nucleotide 
sequence 


SEQ ID
NO: of
full-
length
peptide
sequence


SEQ ID NO:
of contig
nucleotide
sequence


SEQ ID
NO:
of contig
peptide
sequence


Priority
docket number_
corresponding
SEQ ID NO: in
priority
application


SEQ ID
NO: in
U.S.S.N.
09/488,725


1176 


2962


4748


6534 


784CIP2B_865


8224 


1177 


2963 


4749 


6535


784CIP2B_866


8226


1178 


2964 


4750 


6536 


784CIP2B_867


8227 


1179 


2965 


4751 


6537 


f tj'k\^±irZo ODD 


8229 


1180 


2966 


4752 


6538 




8232 


1181 


2967 


4753 


6539 


784CIP2B 870


8236 


1182 


2968 


4754 


6540 


784CIP2B 871


8239 


1183


2969 


4755 




784CIP2B 872 


8244 


1184 


2970 


4756 


6542 


784CIP2B 873 


8245 


1185


2971 


4757 




784CIP2B 874 


8248 


1186 


2972 


4758


6544 


784CIP2B 875 


8251 


1187


2973 


4759


6545


784CIP2B 876 


8253 


1188 


2974 


4760 


6546 


784CIP2B 877


8260 


1189 


2975 


4761 


6547 


784CIP2B 878 


8262


1190


2976 


4762 


6548 


784CIP2B 879


8268 


1191 


2977 


4763 


6549 


784CIP2B 880 


8270 


1192 


2978 


4764 


6550 


784CIP2B 881


8272 


1193 


2979 


4765 


6551 


784CIP2B 882 


8274


1194 


2980 


4766 


6552


784CIP2B 883 


"■ 8274 


1195 





4767 


6553


784CIP2B 884


8275


1196 


2982 


4768 


6554 


784CIP2B 885 


8277 


1197 




4769 


6555 


784CIP2B 886 


8281 


1198 


2984 


4770 


6556 


784CIP2B 887 


8283 


1199 


2985


4771 


6557 


784CIP2B 888


8289 


1200 


2986 


4772 


6558 


784CIP2B 889 


8295 


1201 


<c£^0 / 


4773 


6559 


784CIP2B_890


8300 


1202


2988 


4774 


6560


784CIP2B 891


8301


1203 


2989 


4775


6561 


784CIP2B 892 


8304 


1204 


2990 


4776 


6562 


784CIP2B 893 


8305


1205 


2991


4777 


6563


784CIP2B 894 


8309 


1206


2992 


4778


6564 


784CIP2B 895 


8318 


1207 


2993 


4779 


6565 


784CIP2B 896


8319 


1208 


2994 


4780 


6566 


784CIP2B_897


8321 


1209 


2995


4781 


6567 


784CIP2B 898 


8322 


1210 


2996 


4782 


6568


784CIP2B 899 


8323 


1211 


2997 


4783 


6569


784CIP2B 900


8325 


1212 


2998 


4784 


6570


784CIP2B 901


8331 


1213 


2999 


4785 


6571 




8332 


1214 


3000 


4786 


6572 




8333 


1215


3001 


4787 


6573 




8335


1216


3002 


4788


6574 




8336 


1217 


3003


4789 


6575 


784CIP2B_906


8337 


1218 


3004 


4790 


6576 


784CIP2B 907


8340 


1219 


3005 


4791 


6577 


784CIP2B 908


8343


1220


3006 


4792 


6578 


784CIP2B 909 


8347 


1221 


3007 


4793 


6579


784CIP2B 910 




1222 


3008 


4794 


6580


784CIP2B 911




1223 


3009 


4795 


6581 


784CIP2B_912


8353 


1224 


3010 


4796 


6582 


784CIP2B 913 


8355 


1225 


3011 


4797 


6583 


784CIP2B 914


8361 


1226 


3012 


4798 


6584 


784CIP2B 915 


8365 


1227 


3013 


4799 


6585 


784CIP2B 916


8367 


1228 


3014 


4800


6586 


784CIP2B 917 


8369 


1229 


3015 


4801 


6587 


784CIP2B 919 


8375 


1230 


3016 


4802 


6588 


784CIP2B_920 


8387


1231 


3017 


4803 


6589


784CIP2B 921 


8391 


1232


3018 


4804 


6590 


784CIP2B 922


8393 


1233 


3019 


4805 


6591 


784CIP2B 923 


8393 


1234 


3020 


4806 


6592 


784CIP2B 924


8394 


1235 


3021 


4807 


6593 


784CIP2B 925 


8395 


1236 


3022 


4808 


6594


784CIP2B 926 


8396 


1237 


3023


4809 


6595 


784CIP2B_927


8398 



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PCT/USOO/34263 



SEQ ID NO: 
of full-
length 
nucleotide 
sequence 


SEQ ID 
NO: of
full- 
length 
peptide 
sequence 


SEQ ID NO: 
of contig 
nucleotide 
sequence 


SEQ ID 
NO: 

of contig 

peptide 

sequence 


Priority
docket number_
corresponding 
SEQ ID NO: in 
priority 
application 


SEQ ID
NO: in
U.S.S.N.
09/488,725


1238 


3024 


4810 


6596 


784CIP2B 928 


8402 


1239 


3025


4811 


6597 


784CIP2B 929 


8402 


1240 


3026 


4812 


6598 


784CIP2B 930 


8405 


1241 


3027 


4813 


6599 


784CIP2B 931 


8406 


1242 


3028 


4814 


6600 


784CIP2B 932 


8409 


1243 


3029


4815 


6601 


784CIP2B 933 


8410 


1244 


3030 


4816 


6602


784CIP2B 934 


8414


1245 


3031 


4817 


6603 


784CIP2B 935 


8415 


1246 


3032 


4818 


6604 


784CIP2B 936 


8419 


1247 


3033 


4819


6605


784CIP2B 937 


8426 


1248 


3034 


4820 


6606 


784CIP2B_938


8430 


1249 


3035 


4821 


6607 


784CIP2B 939


8431 


1250


3036 


4822 


6608 


784CIP2B 940 


8432 


1251 


3037 


4823


6609 


784CIP2B 941 


8433


1252


3038 


4824 


6610 


784CIP2B 942 




1253


3039 


4825 


6611 


784CIP2B 943 


843 si 


1254 
1255 


3040 
3041 


4826
4827 


6612 
6613 


784CIP2B 944
784CIP2B 945


8439 


1256 


3042 


4828 


6614 


784CIP2B 946




1257 


3043


4829


6615 


784CIP2B 947


8451 


1258 


3044 


4830 


6616 


784CIP2B 948


"a I CO 


1259 


3045 


4831 


6617 




8460 


1260 


3046 


4832


6618 


784CIP2B 950 


8461


1261 


3047 


4833 


6619 


784CIP2B 951 


J .. — - 


1262 


3048 


4834


6620 


784CIP2B 952 


8464 


1263 


3049 


4835 


6621 


784CIP2B 953 


8465 


1264 


3050 


4836 


6622 


784CIP2B 954 


8467 


1265 


3051 


4837


6623


784CIP2B 955


8470 


1266 


3052 


4838


6624 


784CIP2B 956


fX 


1267 


3053 


4839 


6625 


784CIP2B 957


8473


1268 


3054 


4840 


6626 


784CIP2B 958 


8474 


1269 


3055 


4841 


6627 


784CIP2B 959


8475 


1270 


3056


4842 


6628


784CIP2B 960


8476 


1271 


3057 


4843 


6629 


784CIP2B 961 


8480 


1272 


3058 


4844 


6630 


784CIP2B 962 


8482 


1273 


3059 


4845 


6631


784CIP2B 963 


8482


1274 


3060 


4846 


6632


784CIP2B 964


8486 


1275 


3061


4847 


6633 


784CIP2B 965 


8488 


1276 


3062 


4848


6634 


784CIP2B 966 


8492 


1277 


3063 


4849 


6635 


784CIP2B 967


8494


1278 


3064 


4850 


6636


784CIP2B 968


84^^ 


1279 


3065 


4851 


6637 


784CIP2B 969 


8497 


1280


3066 


4852 


6638 


784CXP2B 970 


8499 


1281


3067 


4853 


6639 


784CIP2B 971


8513 


1282 


3068 


4854 


6640 


784CIP2B 972 


8522 


1283 


3069 


4855


6641 


784CIP2B 973 


8526 


1284 


3070 


4856 


6642 


784CIP2B 974 


8531 


1285 


3071 


4857 


6643 


784CIP2B 975


8533 


1286 


3072 


4858 


6644


784CIP2B 976 


8542 


1287 






6645 


784CIP2B 977


8544 


1288 


3074 


4860 


6646 


784CIP2B 978 


8565 


1289 


3075 


4861 


6647 


784CIP2B 979 


8565 


1290 


3076 


4862 


6648 


784CIP2B 980 


8572 


1291 


3077


4863 


6649 


784CIP2B 981 


8576 


1292 


3078 


4864


6650 


784CIP2B 982


8578 


1293 


3079 


4865 


6651 


784CIP2B 983 


8584


1294 


3080 


4866 


6652 


784CIP2B 984


8598


1295


3081 


4867


6653 


784CIP2B 985 


8602 


1296 


3082 


4868 


6654 


784CIP2B 986 


8604 


1297 


3083 


4869 


6655 


784CIP2B 987


8609 


1298 


3084 


4 870 


6656


784CIP2B 988 


8612


1299 


3085 


4871 


6657 


784CIP2B 989 


8637 



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SEQ ID NO: 
of full-
length 
nucleotide 
sequence 


SEQ ID 
NO: of 
full- 
length 
peptide 
sequence 


SEQ ID NO:
of contig 
nucleotide 
sequence 


SEQ ID 
NO: 

of contig 

peptide 

sequence 


Priority
docket number_
corresponding
SEQ ID NO: in 
priority 
application 


SEQ ID 
NO: in 
U.S.S.N.
09/488,725 


1300 


3086 


4872 


6658 


784CIP2B_990


8640 


1301 


3087 


4873 


6659 


784CIP2B_991


8643 


1302 


3088 


4874 


6660 


784CIP2B 992


8645 


1303 


3089 


4875


6661 


784CIP2B 993


8650 


1304 


3090 


4876 


6662 


784CIP2B 994 


8651 


1305 


3091 


4877 


6663 


784CIP2B 995 


8654 


1306 


3092 


4878 


6664 


784CIP2B 996 


8655


1307 


3093 


4879 


6665 


784CIP2B 997 


8657 


1308 


3094 


4880 


6666 


784CIP2B 998


8665


1309 


3095 


4881 


6667 


784CIP2B 999 


8668 


1310 


3096 


4882 


6668 


784CIP2B 1000 


8671 


1311 


3097 


4883 


6669


784CIP2B 1001 


8672 


1312 


3098 


4884


6670 


784CIP2B 1002 


8692 


1313 


3099 


4885


6671 


784CIP2B_1003


8706


1314 


3100 


4886 


6672 


784CIP2B 1004


8716 


1315 


3101 


4887 


6673 


784CIP2B 1005 


8719 


1316 


3102 


4888


6674 


784CIP2B 1006 


8743 


1317 


3103 


4889 


6675 


784CIP2B 1007 


8764


1318 


3104 


4890 


6676 


784CIP2B 1008 


8764


1319 


3105 


4891 


6677


784CIP2B 1009 


8764 


1320 


3106 


4892 


6678 


784CIP2B 1010


8774 


1321 


3107 


4893 


6679 


784CIP2B 1011 


8782 


1322 


3108 


4894 


6680 


784CIP2B 1012 


8796 


1323 


3109 


4895 


6681 


784CIP2B 1013


8827 


1324 


3110 


4896 


6682 


784CIP2B 1014


8842 


1325


3111 


4897 


6683 


784CIP2B 1015


8842 


1326 


3112 


4898 


6684 


784CIP2B 1016 


8858 


1327 


3113 


4899 


6685 


784CIP2B 1017


8871


1328 


3114 


4900 


6686


784CIP2B 1018 


8921 


1329 


3115 


4901 


6687 


784CIP2B_1019 


8927 


1330 


3116 


4902 


6688 


784CIP2B 1020 


8942 


1331 


3117 


4903 


6689 


784CIP2B_1021 


8994


1332 


3118 


4904 


6690 


784CIP2B 1022


9023 


1333 


3119 


4905 


6691 


784CIP2B_1023 


9028 


1334 


3120 


4906 


6692 


784CIP2B 1024 


9058 


1335


3121 


4907 


6693 


784CIP2B_1025


9058 


1336 


3122 


4908 


6694 


784CIP2B 1026 


9079 


1337 


3123 


4909 


6695 


784CIP2B 1027 


9079 


1338 


3124 


4910 


6696 


784CIP2B_1028 


9082 


1339 


3125 


4911 


6697 


784CIP2B_1029 


9084


1340 


3126 


4912 


6698 


784CIP2B_1030 


9093 


1341 


3127 


4913 


6699 


784CIP2B 1031 


9101 


1342 


3128 


4914 


6700 


784CIP2B_1032 


9103 


1343 


3129 


4915 


6701 


784CIP2B_1033 


9105 


1344 


3130 


4916 


6702 


784CIP2B 1034


9151


1345


3131 


4917 


6703 


784CIP2B_1035 


9161 


1346 


3132 


4918 


6704 


784CIP2B_1036 


9172 


1347 


3133 


4919 


6705 


784CIP2B_1037 


9174 


1348 


3134 


4920 


6706 


784CIP2B_1038 


9204 


1349 


3135 


4921 


6707 


784CIP2B 1039 


9234


1350 


3136 


4922 


6708 


784CIP2B_1040


9235 


1351 


3137 


4923 


6709 


784CIP2B_1041 


9239 


1352 


3138 


4924 


6710 


784CIP2B_1042 


9256 


1353 


3139 


4925 


6711 


784CIP2B_1043 


9276 


1354 


3140 


4926 


6712 


784CIP2B_1044 


9345 


1355


3141 


4927 


6713 


784CIP2B_1045


9379 


1356 


3142 


4928 


6714 


784CIP2B_1046 


9435 


1357 


3143 


4929 


6715 


784CIP2B_1047 


9437 


1358 


3144 


4930 


6716 


784CIP2B 1048 


9469 


1359 


3145 


4931 


6717 


784CIP2B_1049 


9500 


1360 


3146 


4932 


6718 


784CIP2B_1050


9502 


1361 


3147 


4933 


6719 


784CIP2B 1051 


9520
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SBQ ID NO: 
of full- 
length 
nucleotide 
sequence 



1362 



1363
1364 
1365 



SEQ ID
NO: of 
full- 
length 
peptide 
sequence 



3148 



3149
3150 



SEQ ID NO:
of contig
nucleotide 
sequence 



493S 



SEQ ID 
NO:
of contig 
peptide 
sequence 



6720 



6721 



Priority 
docket number_
corresponding
SEQ ID NO: in
priority
application 
784CIP2B 1052



784CIP2B 1053



784CIP2B 1054



SEQ ID 
NO: in
U.S.S.N.
09/488,725



9541 



954X 



1366



1367 



1368



1369 



1371

1372 



1373 



1374 



1375
1376 



1377 



1378 



1379 



1381 
1382



1384 



1385



1386 



1389 



1393 
1394 



3152



6723 



4938 



3153 



6724 



784CIP2B^ 
784CIP2b" 



1055 



4939 



3154

3155



6725 



1056 



4940 



784CIP2B 1057



6726 



"3156~ 



3157 



3158 



3159 
3160 
3161 



6727 



784CIP2B 1058



4942
4943 



784CIP2B 



6728 



784CIP2B 



4944 



3162 
3163



3164 



3165



3166 



3167 



3168 
"3 1^9" 



3170 
3171



3172 
3173



3174 



3175 



3176
3177 



3178 



3179 
"3180 



4945 
4946 
4947 



6729 



1059 
1060 



784CIP2B 1061



6730 



784CIP2B 1062 



6731 
6732



4948 



4949 
4950 



4951 



4953



4955



4956 



4957 



4958 
4959 
4960 



4961 



4962 
4963 



4964 

4965

4966 



6733 



6734 



784CIP2B_1063
1064" 

Togs" 



'"784CIP2B 



784CIP2B 



784CIP2B 1066



6735 



6736 



784CIP2B_1067



6737 



784CIP2B_1068



6738 



784CIP2B 1069 



784CIP2B 1070 



6739 



6740 



784CIP2B_1071



784CIP2B_1072



6741 



6742 



784CIP2B_1073



6743 



784CIP2B_1074



6744 



6745 



784CIP2B
784CIP2B
784CIP2B



1076



6746 



1077 



784CIP2B



6747 



784CIP2B



6748



784CIP2B 



1078 
1079 



6749



6750 



784CIP2B
784CIP2B



1082 

784CIP2B_1083



9556 



9556 



9575 



9589 



9602 



9606 



9622 



9623 



9646 



9747 



9773 



9785 



9801 



9811 



9854 



9854 



9864 
9864 



9871 



9881 



9885 



9901 
9912



9915



9921



1396



1397 



1398 



1399 



1400 
1401 



1403



1405
1406 



1407 



1408 



1409 
1410 



1411 
1412 



1414 
1415 



1417 



1418 



1419 
1420 



1421 
1422 



1423 



3182



3184 



3185 



3186 
3187



3188 



3189 



3190 



3191 



3192 
3193 



3194 



3195 



3196 



3197 



4967 
4968 



6753 



6754 



784CIP2B_1084
784CIP2B_1085



4969 
4970 



6755



784CIP2B_1086



6756 



784CIP2B_1087



4971 



4972 



6757 



784CIP2B_1088



4973 



6758



784CIP2B 1089 



6759 



784CIP2B_1090



4974 



6760 



4975 



6761 



784CIP2B_1091
784CIP2B_1092



4976



6762



784CIP2B_1094



4977 



6763 



784CIP2B_1095



4978 



6764 



784CIP2B 1096 



4979 

4980



6765



784CIP2B 1097 



6766 



784CIP2B_1098



4981 



6767 



784CIP2B_1099



4982 



784CIP2B 1100 



3198 



3199 



3201 
3202 



3203 



3204 



3205 



3206 



3207 



3208 



3209 



6768 



4983 



6769 



784CIP2B_1101



4984 



4985



6770 
6771 



784CIP2B_1102
784CIP2B_1103



4986 



6772 



784CIP2B 1104 



4987 



784CIP2C 1 



6773 



4988 



6774 



4989 



6775 



4990 



6776 



4991 



6777 



6778 



4993 



6779



4994 
4995 



6780 



6781 



784CIP2C_2



784CIP2C_3
784CIP2C 4 



784CIP2C_5
784CIP2C 6 



784CIP2C_7



784CIP2C 8 



784CIP2C 9 



784CIP2C_10



9930
9949 



9951 



9973 



9982 



9994 



10021
10041 



10067 



10073 



10112 



10117 



10132 



10169 



10217 



10226 



10232 



10237 



10279 



IT 



271 



848 



849 



864 



953 



980 



1595 



1697 



1744


1424


3210 


4996 


6782 


784CIP2C 11 


1937 


1425


3211 


4997 


6783 


784CIP2C 12 


1955 


1426 


3212 


4998 


6784 


784CIP2C 13 


1955 


1427 


3213 


4999 


6785


784CIP2C 14 


2185 


1428 


3214 


5000 


6786 


784CIP2C_15


2889 


1429 


3215 


5001 


6787 


784CIP2C 16 


2901 


1430 


3216 


5002 


6788 


784CIP2C 17 


2902 


1431 


3217 


5003 


6789 


784CIP2C 18 


2905 


1432 


3218


5004 


6790 


784CIP2C 19 


2948 


1433 


3219


5005 


6791 


784CIP2C 20 


2956 


1434 


3220 


5006 


6792 


784CIP2C_21




1435


3221 


5007 


6793 


784CIP2C_22


T^^'Jc't"'" """" 


1436 


3222 


5008 


6794 


784CIP2C_23


2966 


1437 


3223 


5009 


6795


784CIP2C 24 


2970


1438


3224 


5010


6796 


784CIP2C_25


2S85 


1439 


3225 


5011 


6797 


7B4CXP2r' 


2987 


1440 


3226 


5012 


6798 




2953 


1441 


3227 


5013 


6799 




2993 


1442 


3228 


5014 


6800 




3017 


1443 


3229 


5015 


6801 




3046 


1444 


3230 


5016


6802 


/ OH\^±k'4\^ JJL 


3050 


1445


3231 


5017 


6803 




3357 


1446 


3232 


5018 


6804 




3359 


1447


3233 


5019 


6805 




3432 


1448


3234 


5020 


6806 


/ O ft w •!> JT « <3I> 


3438


1449 


3235 


5021 


6807 


/ouV^Xir^L. Jib 


3439


1450


3236 


5022 


6808 




3463 


1451


3237 


5023 


6809 






1452 


3238 


5024 


6810


784CIP2C_41




1453


3239


5025 


6811


784CIP2C 42 


3467 


1454


3240 


5026 


6812


784CIP2C_43


3466


1455 


3241 


5027


6813 


784CIP2C_44




1456


3242 


5028 


6814 


784CIP2C 45 


3484 


1457 


3243 


5029 


6815 


784CIP2C_46


3486 


1458 


3244 


5030 


6816 


784CIP2C 47 


3491


1459 


3245 


5031 


6817 


784CIP2C_48


3493 


1460 


3246 


5032 


6818 


784CIP2C 49 


3494 


1461 


3247 


5033 


6819 


784CIP2C_50


3495 


1462 


3248 


5034 


6820 


784CIP2C 51 


3496 


1463 


3249 


5035 


6821 


784CIP2C 52 


3503 


1464 


3250 


5036 


6822 


784CIP2C_53


3503 


1465 


3251 


5037 


6823


784CIP2C 54 


3504 


1466 


3252


5038 


6824 


784CIP2C 55 


3511 


1467 


3253 


5039 


6825 


784CIP2C_56


3531 


1468 


3254 


5040 


6826 


784CIP2C 57 


3536 


1469 


3255 


5041 


6827


784CIP2C_58


3546 


1470 


3256


5042 


6828 


784CIP2C 59 


3548 


1471 


3257 


5043 


6829 


784CIP2C 60 


3551 


1472 


3258


5044 


6830 


784CIP2C_61


3553 


1473 


3259 


5045 


6831 


784CIP2C_62


3564 


1474 


3260 


5046


6832 


784CIP2C 63 


3567 


1475 


3261 


5047


6833


784CIP2C 64 


3572 


1476 


3262 


5048 


6834 


784CIP2C_65


3573 


1477 


3263 


5049 


6835 


784CIP2C 66 


3574 


1478


3264 


5050 


6836 


784CIP2C_67


3583 


1479 


3265 


5051 


6837 


784CIP2C_68


3615 


1480 


3266 


5052


6838 


784CIP2C 69 


3623 


1481 


3267 


5053 


6839 


784CIP2C_70


3629


1482


3268 


5054 


6840 


784CIP2C_71


3666 


1483 


3269 


5055


6841 


784CIP2C 72 


3667 


1484 


3270 


5056 


6842


784CIP2C_73


3906 


1485 


3271 


5057 


6843 


784CIP2C 74 


3912


1486 


3272 


5058 


6844 


784CIP2C_75


3924 


1487 


3273 


5059 


6845 


784CIP2C 76 


3928 


1488


3274 


5060 


6846 


784CIP2C 77 


3935 


1489 


3275 


5061 


6847 


784CIP2C 78 


3359 


1490 


3276 


5062 


6848


784CIP2C_79


3981 


1491 


3277 


5063 


6849 


784CIP2C_80


3989 


1492 


3278


5064 


6850 


784CIP2C 81 


4295 


1493 


3279 


5065 


6851 


784CIP2C 82 


4300 


1494 


3280 


5066 


6852 


784CIP2C 83 


4360 


1495


3281


5067 


6853 


784CIP2C 84 


4362 


1496 


3282 


5068 


6854 


784CIP2C_85


4371 


1497


3283 


5069 


6855 


784CIP2C 86 


4373


1498 


3284 


5070 


6856 


784CIP2C_87


sio /o 


1499 


3285 


5071 


6857 


784CIP2C_89




1500 


3286 


5072 


6858






1501 


3287 


5073 


6859 


784CIP2C_91


4409 


1502 


3288 


5074 


6860




4421 


1503 


3289 


5075


6861 




4421 


1504 


3290 


5076


6862 


/ D 9 L> a JK ^ L. _ 


4426 


1505


3291 


5077 


6863




4430 


1506 


3292 


5078 


6864 




4435


1507 


3293 


5079 


6865 


/ ci*s\^x Jr 3/ 


_ 

4* J o 


1508 


3294 


5080 


6866 




4439 


1509


3295 


5081


6867 




4440 


1510


3296 


5082 


6868


784CIP2C_100


4441 


1511 


3297 


5083 


6869 




4442 


1512 


3298


5084 


6870




4455 


1513 


3299 


5085 


6871




4462 


1514 


3300 


5086 


6872 


784CIP2C 104 


4466 


1515 


3301 


5087 


6873 


784CIP2C_105




1516 


3302 


5088 


6874


784CIP2C_106


4477 


1517


3303 


5089 


6875 


784CIP2C 107 


4481 


1518 


3304 


5090 


6876


784CIP2C 108 


4483 


1519 


3305 


5091 


6877


784CIP2C 109 


4484 


1520 


3306 


5092 


6878


784CIP2C_110 


4486 


1521 


3307 


5093 


6879


784CIP2C 111 


44S0 


1522 


3308 


5094 


6880


784CIP2C_112


4499 


1523 


3309 


5095 


6881


784CIP2C_113


4503 


1524 


3310 


5096 


6882 


784CIP2C_114


4506 


1525 


3311 


5097


6883 


784CIP2C 115 


4509 


1526 


3312 


5098 


6884


784CIP2C 116 


4514 


1527 


3313 


5099 


6885 


784CIP2C 117 


4516 


1528 


3314 


5100 


6886


784CIP2C 118 


4522 


1529 


3315 


5101 


6887 


784CIP2C 119 


4525 


1530 


3316


5102 


6888 


784CIP2C_120 


4527 


1531 


3317


5103 


6889 


784CIP2C_121 


4528 


1532 


3318 


5104


6890 


784CIP2C_122 


4529 


1533 


3319 


5105 


6891 


784CIP2C 123 


4532 


1534 


3320 


5106 


6892 


784CIP2C 124 


4537 


1535


3321 


5107 


6893 


784CIP2C 125 


4538 


1536 


3322 


5108 


6894 


784CIP2C 126 


4551 


1537


3323 


5109 


6895 


784CIP2C 127 


4552 


1538 


3324 


5110 


6896 


784CIP2C_128


4559 


1539 


3325 


5111 


6897 


784CIP2C 129 


4567 


1540 


3326 


5112 


6898 


784CIP2C 130 


4568 


1541 


3327 


5113 


6899 


784CIP2C_132


4585


1542 


3328


5114 


6900 


784CIP2C_133


4592 


1543 


3329 


5115 


6901 


784CIP2C 134 


4609 


1544 


3330 


5116


6902 


784CIP2C 135 


4616 


1545 


3331 


5117 


6903 


784CIP2C_136


4617 


1546 


3332 


5118 


6904 


784CIP2C 137 


4618 


1547 


3333 


5119 


6905 


784CIP2C 138 


4620



WO 01/53312



PCT/US00/34263





1548 


3334 


5120 


6906 


784CIP2C 139 


jjg24 


1549


3335 


5121 


6907 


784CIP2C 140 




1550 


3336 


5122 


6908 


784CIP2C_141


4634 


1551


3337 


5123 


6909 


784CIP2C 142 




1552 


3338 


5124 


6910 


784CIP2C 143 


4639 


1553 


3339 


5125 


6911 


784CIP2C_144




1554


3340 


5126 


6912 


784CIP2C 145 


4644 


1555 


3341 


5127 


6913 


784CIP2C_146


4655 


1556 


3342 


5128 


6914 


784CIP2C_147


bo 


1557 


3343 


5129 


6915




4677 


1558 


3344 


5130 


6916 


784CIP2C_149


4677 


1559 


3345 


5131


6917




4677 


1560 


3346 


5132


6918 




4682 


1561 


3347 


5133 




784CIP2C_153


4690 


1562


3348 


5134


6920 


784CIP2C_154


4691 


1563


3349 


5135 


6921


784CIP2C_155


4727 


1564


3350 


5136 


6922 


784CIP2C_156


4730 


1565 


3351 


5137 


6923 


784CIP2C 157 


4734 


1566


3352 


5138


6924 


784CIP2C_158


475'> 


1567 


3353 


5139 


6925 


784CIP2C 159 


4764 


1568 


3354


5140 


6926 


784CIP2C 160 


4766 


1569 


3355 


5141


6927 


784CIP2C_161


4793 


1570 


3356


5142 


6928 


784CIP2C 162 


4^25 


1571 


3357


5143 




784CIP2C 163 


4826 


1572 


3358 


5144 


6930 


784CIP2C_164


4850 


1573 


3359


5145 


6931 


784CIP2C_165


4853 


1574 


3360 


5146 




784CIP2C_166


4855 


1575 


3361


5147




784CIP2C_167


4856 


1576


3362 


5148 




784CIP2C_168


4867 


1577 


3363 


5149


6935 




4869 


1578 


3364 


5150 


6936 




4878 


1579 


3365 


5151 


6937




4880 


1580 


3366 


5152 


6938 


' o*^-.XJr*c^» X /<s 


4942 


1581 


3367 


5153 


6939 




494d 


1582 


3368 


5154 


6940 


784CIP2C_174




1583 


3369 


5155 


6941


784CIP2C_175




1584 


3370 


5156 


6942 


784CIP2C 176 


4954


1585 


3371 


5157 


6943 


784CIP2C 177 


4956 


1586 


3372 


5158 


6944 


784CIP2C_178


4961 


1587 


3373 


5159


6945 


784CIP2C 179 


5590


1588


3374 


5160


6946


784CIP2C 180 


5599


1589 


3375 


5161 


6947 


784CIP2C_181


5692 


1590 


3376 


5162 


6948 


784CIP2C 182 


5732 


1591 


3377 


5163 


6949 


784CIP2C_183


5765 


1592 


3378 


5164 


6950


784CIP2C 184 


5771


1593 


3379 


5165


6951 


784CIP2C_185


5774 


1594 


3380 


5166 


6952 


784CIP2C 186 


5793 


1595 


3381 


5167 


6953 


784CIP2C 187 


5806 


1596


3382 


5168


6954 


784CIP2C_188


5852 


1597 


3383 


5169 


6955 


784CIP2C 189 


5892 


1598


3384 


5170 


6956


784CIP2C 190 


6057


1599 


3385 


5171 


6957 


784CIP2C 191 


6061 


1600 


3386 


5172 


6958 


784CIP2C 192 


6109 


1601 


3387


5173 


6959 


784CIP2C_193


6160 


1602 


3388 


5174 


6960 


784CIP2C_194 


6297 


1603 


3389 


5175 


6961 


784CIP2C 195 


6398 


1604 


3390


5176 


6962 


784CIP2C 196 


6398 


1605 


3391 


5177 


6963 


784CIP2C_197


6415 


1606 


3392 


5178 


6964 


784CIP2C_198


6448 


1607 


3393 


5179 


6965


784CIP2C 199 


6469 


1608


3394 


5180 


6966


784CIP2C 200 


6476 


1609 


3395


5181 


6967 


784CIP2C_201


6561 



1610 


3396 


5182 


6968 


784CIP2C_202


6574 


1611 


3397 


5183 


6969 


784CIP2C 203 


6578 


1612 


3398 


5184 


6970 


784CIP2C 204 


6662 


1613 


3399 


5185


6971 


784CIP2C_205 


6672 


1614 


3400 


5186 


6972 


784CIP2C_206 


6691 


1615 


3401 


5187 


6973 


784CIP2C 207 


6695 


1616


3402 


5188 


6974 


784CIP2C 208 


6746 


1617


3403 


5189


6975 


784CIP2C_209


6898 


1618 


3404 


5190 


6976 


784CIP2C_210


6938 


1619 


3405 


5191 


6977 


784CIP2C_211


6943 


1620 


3406 


5192 


6978 


784CIP2C 212 


7110 


1621 


3407 


5193 


6979 


784CIP2C 213 


7200 


1622 


3408 


5194 


6980


784CIP2C_214


7212


1623 


3409 


5195 


6981 


784CIP2C 215 


7218 


1624 


3410 


5196 


6982 


784CIP2C_216


724g "" 


1625 


3411 




6983 


784CIP2C_217


7Crtr^ 


1626 


3412


5198 


6984 


784CIP2C_218


7509 


1627 


3413 


5199 


6985 




7523


1628 


3414 


5200 


6986 


784CIP2C_220



f 044 


1629 


3415


5201 


6987 


784CIP2C_221


f 


1630 


3416 


5202 


6988 




7568 


1631


3417 


5203 


6989 




7631 


1632 


3418 


5204 


6990 


784CIP2C_224




1633 


3419 


5205 


6991 


784CIP2C_225




1634 


3420 


5206 


6992 


784CIP2C_226




1635 


3421 


5207 


6993 


784CIP2C 227 


7907 


1636 


3422 


5208 


6994 


784CIP2C_228


7943 


1637 


3423 


5209 


6995 


784CIP2C 229 


817'S 


1638 


3424 


5210 


6996


784CIP2C 230 


8216 


1639 


3425 


5211 


6997 


784CIP2C 231 


8225 


1640 


3426 


5212 


6998 


784CIP2C 232 


8271 


1641 


3427 


5213 


6999 


784CIP2C 233 


83^7 


1642 


3428 


5214 


7000 


784CIP2C_234 


8466 


1643 


3429


5215 


7001 


784CIP2C_235 


8503 


1644 


3430 


5216 


7002 


784CIP2C 236 


8953 


1645 


3431


5217 


7003 


784CIP2C 237 


9106 


1646 


3432 


5218 


7004 


784CIP2C_238


9139 


1647 


3433 


5219


7005 


784CIP2C 239 


9555 


1648 


3434


5220 


7006 


784CIP2C_240


9650 


1649 


3435 


5221 


7007 


784CIP2C 241 


9889 


1650


3436 


5222 


7008 


784CIP2C_242


9933 


1651 


3437 


5223 


7009 


784CIP2C_243


9953 


1652 


3438 


5224 


7010 


784CIP2C 244 


9981


1653 


3439 


5225 


7011 


784CIP2D 1 


746 


1654 


3440 


5226 


7012 


784CIP2D_2


3558 


1655 


3441 


5227 


7013 


784CIP2D 3 


3558 


1656 


3442 


5228 


7014 


784CIP2D 4 


3633 


1657 


3443 


5229 


7015 


784CIP2D_5


3658 


1658 


3444 


5230 


7016 


784CIP2D_6


3732 


1659 


3445 


5231 


7017 


784CIP2D 7 


4004 


1660 


3446 


5232 


7018 


784CIP2D 8 


4700 


1661 


3447 


5233 


7019 


784CIP2D_9


4703 


1662 


3448 


5234 


7020 


784CIP2D 10 


4774 


1663 


3449 


5235 


7021 


784CIP2D_11


4894 


1664 


3450 


5236


7022 


784CIP2D_12


4918 


1665 


3451 


5237 


7023 


784CIP2D 13 


5159 


1666 


3452


5238 


7024 


784CIP2D_14


7443 


1667 


3453 


5239 


7025 


784CIP2D_15


8673 


1668 
1669


3454 


5240 


7026 


784CIP2D_16 


8679 




3455 


5241 


7027 


784CIP2D_17


8727 


1670


3456 


5242 


7028 


784CIP2D_18


8734


1671 


3457 


5243 


7029 


784CIP2D 19 


8756


1672 


3458


5244 


7030 


784CIP2D_20


8816


1673 


3459 


5245 


7031 




8844 


1674 


3460 


5246 


7032 


784CIP2D_22


8846


1675 


3461 


5247


7033 




8912 


1676 


3462


5248 


7034 


784CIP2D_24


8916 


1677 


3463 


5249 


7035 


784CIP2D_25


8918 


1678 


3464 


5250 


7036 


784CIP2D_26


8941 


1679


3465 


5251


7037




8941 


1680 


3466 


5252 


7038 


784CIP2D_28


8951 


1681 


3467 


5253 




784CIP2D_29


8951 


1682


3468


5254 


7040 


784CIP2D 30 


9007 


1683


3469 


5255 


7041 


784CIP2D 31 


9012 


1684 


3470 


5256 


7042 


784CIP2D_32


9013 


1685 


3471 


5257 


7043 


784CIP2D_33


9025 


1686


3472 


5258


7044 


784CIP2D_34


9053 


1687 


3473 




7045 


784CIP2D_35


9054 


1688 


3474


5260 


7046 


784CIP2D_36


9054 


1689 


3475 


5261 


7047 


784CIP2D 37 


9113 


1690 


3476 


5262 


7048 


784CIP2D_38 


9134 


1691 


3477 


5263 


7049 


784CIP2D 39 


9152 


1692 


3478 


5264 


7050 


784CIP2D 40 


9152 


1693 


3479


5265 


7051 


784CIP2D_41


9211 


1694 


3480


5266 


7052 


784CIP2D_42


9223 


1695 


3481 


5267 


7053 


784CIP2D_43


9223 


1696 


3482 


5268 


7054 


784CIP2D_44


9231 


1697 


3483


5269


7055 


784CIP2D 45 


9236 


1698 


3484 


5270


7056 


784CIP2D_46


9236 


1699 


3485 


5271 


7057 


784CIP2D_47 


9303 


1700 


3486 


5272 


7058


784CIP2D_48


9309 


1701 


3487


5273


7059 


784CIP2D_49


9314 


1702 


3488 


5274 


7060


784CIP2D 50 


9326


1703 


3489 


5275


7061 


784CIP2D 51 


9339 


1704 


3490


5276 




784CIP2D_52


^548 


1705


3491 


5277 


7063 


784CIP2D_53


9376


1706 


3492 


5278 


7064 


784CIP2D_54 


9382 


1707 


3493 


5279 


7065 


784CIP2D 55 


9407 


1708 


3494 


5280


7066


784CIP2D_56 


9414 


1709 


3495 


5281 


7067 


784CIP2D_57


9439 


1710 


3496


5282 


7068


784CIP2D 58 


9485 


1711 


3497 


5283 


7069 


784CIP2D_59


9493


1712 


3498 


5284 


7070 


784CIP2D 60 


9501 


1713 


3499 


5285 


7071 


784CIP2D_61




1714 


3500 


5286 


7072 


784CIP2D_62


9526 


1715 


3501 


5287 


7073 


784CIP2D_63


9551 


1716 


3502 


5288


7074 


784CIP2D_64


9557 


1717 


3503 


5289 


7075 


784CIP2D_65


9568 


1718 


3504 


5290


7076 


784CIP2D_66


9588 


1719 


3505 


5291 


7077 


784CIP2D_67


9597 


1720 


3506 


5292 


7078 


784CIP2D 68 


9615 


1721 


3507 


5293 


7079 


784CIP2D 69 


9628 


1722 


3508 


5294 


7080 


784CIP2D 70 


9649 


1723 


3509 


5295 


7081


784CIP2D 71 


9652 


1724 


3510 


5296 


7082 


784CIP2D_72


9660 


1725 


3511 


5297 


7083 


784CIP2D_73


9662 


1726 


3512 


5298


7084


784CIP2D 74 


9725 


1727 


3513 


5299 


7085 


784CIP2D 75 


9746 


1728 


3514 


5300 


7086 


784CIP2D 76 


9777 


1729 


3515 


5301 


7087 


784CIP2D 77 


9787 


1730 


3516 


5302 


7088 


784CIP2D 78 


9790 


1731 


3517 


5303 


7089 


784CIP2D_79 


9842 


1732 


3518


5304 


7090 


784CIP2D 80 


9842 


1733 


3519 


5305 


7091 


784CIP2D_81


9848


1734


3520


5306 


7092 


784CIP2D 82 


9867 


1735 


3521 


5307 


7093 


784CIP2D 83 


10010


1736 




5308 


7094 


784CIP2D_84


10011 


1737 




5309 


7095 


784CIP2D 85 


10052


1738


3524 


5310 


7096 


784CIP2D 86 


10057 


1739


3525 


5311 


7097 


784CIP2D_87


10085 


1740 




5312 


7098 


784CIP2D_89


10139 


1741


3527


5313 


7099 


784CIP2D_90


10142 


1742 


3528 


5314 


7100 


784CIP2D 92 


10165 


1743


3529 


5315 


7101 


784CIP2D 93 


10173 


1744 


3530 


5316


7102 


784CIP2D_94


10173 


1745 


3531 


5317 


7103 


784CIP2D 95 


I02li 


1746 


3532 


5318 


7104 


784CIP2E_1


3121


1747


3533 


5319 


7105 


784CIP2E_2


3628 


1748


3534 


5320 


7106 


784CIP2E_4


3673


1749


3535 


5321 


7107 


784CIP2E 5 


4018 


1750


3536 


5322 


7108 


784CIP2E 6 


4467 


1751


3537 


5323 


7109 


784CIP2E_7


4865 


1752 


3538 


5324 


7110 


784CIP2E_8


4916 


1753 


3539 


5325 


7111 


784CIP2E_9


4923 


1754 


3540 


5326 


7112 


784CIP2E 10 


4926 


1755


3541 


5327 


7113 


784CIP2E 11 


4962 


1756


3542 


5328 


7114 


784CIP2E 12 


4963


1757 


3543 


5329


7115 


784CIP2E 13 


4964 


1758 


3544 


5330 


7116 


784CIP2E_14


4988 


1759


3545 


5331 


7117 


784CIP2E_15


5855 


1760


3546 


5332 


7118 


784CIP2E 16 


7682 


1761 


3547 


5333 


7119 


784CIP2E 17 


7682 


1762 


3548 


5334 


7120 


784CIP2E_18


7699 


1763


3549 


5335 


7121 


784CIP2E_19


7707 


1764


3550


5336 


7122 


784CIP2E 20 


7707 


1765


3551 


5337 


7123 


784CIP2E 21 


7752 


1766


3552


5338 


7124 


784CIP2E_22


8357 


1767 


3553 


5339 


7125 


784CIP2E_23


9065 


1768 


3554 


5340 


7126 


784CIP2E 24 


9324 


1769 


3555


5341 


7127 


784CIP2F_1


2976 


1770


3556


5342 


7128 


784CIP2F 2 


3559 


1771 


3557


5343 


7129 


784CIP2F 3 


4021 


1772 


3558 




7130 


784CIP2F 4 


4474 


1773 




5345 


7131 


784CIP2F 5 


4566 


1774 


3560 


5346 


7132 


784CIP2F 6 


4705 


1775 


3561 


5347 


7133 


784CIP2F 7 


4707 


1776 


3562 


5348 


7134 


784CIP2F_8


4712 


1777 


3563 


5349


7135


784CIP2F 9 


5008


1778 


3564 


5350 


7136 


784CIP2F 10 


5009 


1779 


3565 


5351 


7137 


784CIP2F_11 


5015


1780 


3566 


5352 


7138 


784CIP2F_12


5015 


1781 


3567 


5353 


7139


784CIP2F 13 


7724 


1782 


3568 


5354 


7140 


784CIP2F_14


7725 


1783 


3569 


5355 


7141 


784CIP2F 15 


8828 


1784 


3570 


5356


7142 


784CIP2F_16


8830 


1785 


3571 


5357 


7143 


784CIP2F_17


9739 


1786 


3572 


5358


7144 


784CIP2F_18


9896



TABLE 7



SEQ ID NO:


Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence


Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence


Amino acid segment containing signal peptide (A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop Codon, /=possible nucleotide deletion, \=possible nucleotide insertion)


5359 


337 


1131 


AHI»S ARLSAI, I LDEVA ILPAPQNLS VL»STNMKH1iLMWS P VIaJg 
ETVyySVEYQGEYBSLyTSHlWIPSSWCSLTEGPSCDVTDDITA 
TVPYNLRVRATLGSQTS /CliEHP/VS I PLIETQPSLPDL/RMEI 
TKDGFHLVIBLEDLGPQFEFLVAYWRREPGAEEHVKMVRSGGIP 
VHLETMEPGAAYCVKAQTFVKAIGRYSAFSQTECVEVQGEAIPI. 
VLALFAFVGPMLILWVPLPVWKMGRLIiQ/YLluLPRGGSSQTPW 
KITQF 


5360


2 


1115 


PRVRSSGGQBDPASQQWARPRFTQPSKMRRRVIARPVGSSVRLK 
CVASGHPRPDITWMKDDQALrRPEAAEPRKKKWTLSIiKNLRPED 
SGKYTCRVSNRAGAINATYJCVDVIQRTRSKPVLTGrHPVNTTVD 
FGGTTSFQCKVRSDVKPVIQWLKRVEYGAEGRHNSTIDVGGQKF 
WLPTGDVWSRPDGSYI^KXLITRARQDDAGMYICLGANTMGYS 
FR5APLTVLPDPKPPGPPVASSSSATSLPWPWIGIPAGAVFIL 
GTLLItWLCOAQKKPCTPAPAPPIsPGHRPPGTARDRSGDKDLPSL 
AALSAGPGVGLCEEHGSPAAPQHLLGPQPVAGPKLYPKLYTGtta 
TPHTYTHPPPSCQtiMSSHS 


5361 


3 


925 


HEGS I S S ANI IiL.DDQPQPKLTDFAMAH?RSHtEHQSCTINMTSS 
SSKHI»WYMPEEYIRQGKLSIKTDVYSFGIVIMEVLTGCRVVI,Dn 
PKHXQLRDLLREIiMEKRGLDSCLSPLDKKVPPCPRMFSAKLFCL 
AGRCAATRAKURPSMDEVLNTLESTQASLYFAEDPPTSLKSFRC 
PSPLFLENVPSIPVEDDESQMNNLLPSDEGUilDRMTQKTPFBC 
SOSEVMFLSLDKKPESKRNEEACNMPSSSCEESWFPKYIVPSQD 
IjRPYKVNIDPSSEAPQHSCRSRPVESSCSSKPSWDEYEQYKKE 


5362


2 


4879 


SCQVBGCTRTYNSSQSIGKHMiCTAHPDQYAAFKMQRKSKKGQrCA 
NNLNTPNNGKFVYFLPSPVNSSNPFFTSQTKANGNPACSAQLQH 
VSPPIFPAHIASVSTPLLSSMESV1NPNITSQDKN2QGGMLCSQ 
MEKLPSTALPAQMEDLTKTVLPLNIDRGSDPFLSLPAESSSIDL 

fpspadsgtnsvpsqlenwtmiyssqiegntnssfi.kggngeha 
vfpsqvnvanki^sstnaqqsapekvkkdrgrgqtgkerkpkhuk 
rakkpaiirdgkprcsrcyraftnprslgghriskrsyckpldga 
eiaqellqsngqpsllasmilsxnavnloqpqqs?fwpeacfkd 
psflqi.laenrspaflpntfprsgvtnf*ntsvsqegseiiiqal 

ETAG I PSTPEGAEMLSHVSTGCVSDASQVNATVMPNPTVPPLLH 
WCHPNTXiLTNQWRTSNSKTS'SIEECSSLPVPPTNDZJXKTVEN 
GLCSSSFPNSGGPSQNPTSNSSRVSVISGPQNTRSSHLMKKGWS 
ASKRRKKVAPPLlAPNASQNLVTSDIiTTMGLIAKSVEIPTTNLM 

snviptcepqslvenltqklnnvnnqlfmtdvkenfktsi*esht 

VLAPliTLKl^ENODSQMMALNSCrTSVNSDLQlSEDNVIQNFEKT 
LEIIKTAMNSQILEVKSaaQGAGETSQMAQINYNlQLPSVNTVQ 
NNlCLPDSSP\FSSFISVMPTESNIPQSE\VSHKEDQIQEILEGr* 
QiaKLEKDLSTPASQCVLIirrSVTI.TPTPVKSTADITVlOPVSS 
MINIQPNDKVNKPFVCQNQGCNYSAMTiCDAI.PKHYGKIHQYTPB 
MILEIKKNQLKPAPFKCWPTCTiCrFTRlTSKLRAHCQLVHHFTT 
EEMVKbKIKRPYGRKSQSENVPASRSTQVKKQLAMTEENKKBSQ 
PAtiELRAETQNTHSNVAVI PEKQhlEKKSPDKTESShQVXTVTS 
BQCmifAI/TNTQTKGRKIRRHKKBKEEKKRKKPVSQSItEFPTRY 
SPYRPYRCVHQGCFAAFTIQONLtLHYQAVHKSDItPAFSAEVEE 
ESEAGKESEETETKQTIjKEFRCQVSDCSRI FQAI TGItXQHYMICL 
HEMTPEEIES^T^ASVDVGKPPCDQLECKSSFTTYLMYVVHLEAD 
HGIGLRASKrrEEDG\nfKC33CEGCDRIYATRSNIXRHIFMKHNDK 
HKAHLIRPRRLTPGQENMSSKANQEKSKSKHRGTKHSRCX3KEGI 
KMPKTKRKKKNNriENKNAKl VQ lEENKPYS liKRCKHVYSIKARN 
DAIiSECTSRFVTQYPCMlKGCTSWTSESNIIRHYKCHKLSKAF 
TSQHRNLLIVFKRCCNSQVKETSEQEGAKNDVKDSDTCVSESND 
NSRTTATVSQKEVEKNE*DEMDELTBLFITKLrNEDSTSVETQA 
NTSSWSNDFQEDNLCQSERQKASNIiKRVKKEKWSQWKKRKVE 
KAEPASAAELSSVRKEEETAVAIQTIEEHPASFDWSSFKPMGFE 
VSFLKFLEESAVKQKXNTDKDHPKTGNKKGSHSNSRKTTIDKTAY 
TSGNHVCPCKESETPVQFANPSQU3CSDMVKrVI*DKNI*KDCTEL
VLKQLQEMKPTVSLKKLEVHSNDPDMSVMKDISIGKATGRGQY 


5363 


8066 


703 


RLCCTGGGEGTPGASGKRGPAATTSLVLCI PSVPPPVPFPXO^P 
PPSWRRQPPGGIRRDFSRRLRREANLVATCLPVRASLPHRUSML 
RGPG PGLLLLAVIiCLGTAVPSTGASKSKROAQQMVQPOS PVAVS 
QSKPGCYDNGKHYQ INQQWERTYLGNALVCTCYGGSRGFNCESK 
PEAEETCFDKYTGNTYRVGDTyERPKDSMIWDCTCIGAGRGRIS 
CTIANRCHBGGQSYKIGDTWRRPHETGGYMIiECVCLGNGKGEWT 
CKP lAEKCFDHAAGTS YWGETWEKPYQGWMMVDCTCLGEGSGR 
ITCTSRNRCNDQDTRTSYRIGDTWSKKDNRGNIjIiQCICTGNGRG 

ewkcerhtsvqrrssgsgpftdvraavyqpqphpqpppyghcvt 
dsgwysvgmqla* ktqgnkqml\ctclgngvscqetavtqtyg 
gnsngepcvlpftyngrtfyscttegrqdghlwcsttsnyeqdq 

KYS FC^^DHTVI*VQTRGGNS^fGALaHFPPIJYWNH^rYT0CTSEGRR 
DNMKWCGTTQNYDADQKFGFCPMAAHEffilCTTNEGVMYRIGDQW 
DKQHDMGHMMRCTCVGNGRGEWTCIAYSQLRtJQClVDDITYNVN 
DTFHKRHEEGHMIiNCTCFGQGRGRWKCDPVDQCQDSETGTPYQl 
GDSWEKYVHGVRYQCYCYGRGIGEWHCXjPliQTYPSSSGPVEVFI 
TETPSQPNSHPIQWNAPQPSHISKYIIiRWRPKNSVGRWKBATIP 
GHLWS YTIKGLKPGWYEGCLIS IQQYGHQEVTRFDFTTTSTST 
PVTSNT\VTGETTPFSPLVATSESVTEITASSFWSWVSASDTV 
SGPRVEYELSEE(3DEPQYIjVLPSTATSV\NlP\DLt.PGRKyiVN 
WQISEIXSEQSLILSTSQTTAPDAPPDPTVrKSVDDTSIVVRWSR 
PQAPITGYRIVYSPSVEGSSTEIiNLPETANSVTLSDI^QPGVQYN 
ITIYAVEENQESTPWIQQETTGTPRSDTVPSPRDLQFVEVTDV 
KVTXMWTPPESAVTGYRVDVIPVNLPGEHGORIjPL.SRKTF\ABN 

tgi^pgvtttfiwfavshgreskpltaqqttklNdaptmlqfvn 

ETDSTVLVRWTPPRAQITGYRLTVGIiTRROQPRQYNVGPSVSKY 
PLRWIiQPAS E YTVSLVAI KGNQES PKATGVFTTLQPGSS I PP YW 
TEVTETTI VITWTPAPRIGFKLGVRPSQGGEAPREVTSDSGS I V i 
VSGLTPGVEYVYTIQVLRD3QERDAP \IVNK\WXPI,SPPTNLH 
LEANPDTGVLTVSWERSTTPD ITOYRITTTPTNGQQGHSIiEEW 
HAJX}SSCTF\DN1.EVPGLSYNVSVYTVKDDKESVPISDT1IPAV 
PPPTDIiRFTN/ ILGPDTMRVTWXAPPPSIDLTNFLVRYS PVKNE 
GRMLQSLS I FFI»SDN\AWLXNLIiPGTEyWSVSSVyEQHESTP 
\IiRGRQKTGLDSP\TGIDFS\DITA\NSFT\VHW\IAPRA/TPI 
TGyRXR\HHPEHF\SGRPREDR\VPHSRNSlTLTKLTPGTEYVV 
SlVALNGREBSPr*LIGQQSTVSPVPRDLEWAATPTSI»I,l\SWD 
APAVTVRYYRITYGETGGNSPVQE FTVPGSKSTATISGLKPGVD 
YTITVYAVTGRGDSPASSKPISIKYRTEIDKPSQMQVTDVQDNS 
ISVKWLPSSSPVTGYRVTTT\PKWGPG\PTKTKTAGPDQTEMTI 
EGLQPTVEYVVSVYAQNPSGESQPIiVQTAVTtriDRPKGIiAFTDV 
DVDS r 5CI AWES PQGQVSRYRVTYSSPEDGIHELFPAPDGEEDTA 
ELOGLRPGSEYTVSVVALHDDMESQPCilGTQSTAIPAPTDLKFT 
QVTPTSLSAQWTPPNVQLTGYRVRVTPKEKTGPMKE INLAPDSS 
V V V h»ljljn V A *. ft-Ici V o V xALiKX/ 1 ij 1 o jtPAvGVVTTijE W VS PPRR 
ARVTDATETTITlSWRTKrETITGFQVDAVPANGCyrPIQRTIKP 
DVRSYTITGIiOFC3TDYKIYtiYT£iNDNARSSPWIDASTAX0APS 
NLRFLATTPNSLDVSWQPPRARITGYIXKYEKPGSPPREWPRP 
RPGVTEAXITGbEPGTEYTlYVIALKNNQKSEPIilGRKKTDELP 
QLVTLPHPNLHGPEILDVPSTVQKXPFVTHPGYDTGNGIQLPGT 
SGQQPSVGQQMIFEEHGFRRXTPPXXATPIRHRPRPYPPNVGQE 
AliSQTXISWAPPQDTSEYriSCHPVGTDEEPLQFRVPGXSTSAT 
LTGLTRGATYNI X VEALKDQQRHKVRBEWTVGWS VNBGr,WQPT 
DDSCFDPYTVSHYAVGDEWBRMSESGFKLLCQCIjGFGSGHFRCD 
SSRWCHDNGVNYKIGEKWDRQGENGQMMSCrCIiGNGKGEFKCDP 
HEATCYDDGKTYHVGEQWQKEYIiGAICSCn*CFGGQRGWR<:DNCR 
RPGGEPS PEGTTGQS YNQYSQUYHQRTNTNVNCP IECFMPl>DVQ 
ADREDSRE 


5364


8066 


703 


RLCCTGGGEGTPGASGKRGPAATTSLVIiCIPSVPPPVPFPTI/WP 
PPSWRRQPPGGIRRDFSRRLRREANIiVATCI,PVRASLPHRIJlJi(t» 
RGPGPGLIiLUVVEjCIjGTAVPSTGASKSKRQAQQMVQPQSPVAVS 
SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide (A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop Codon, /=possible nucleotide deletion, \=possible nucleotide insertion)
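The legend in the column heading defines a small annotation scheme: one-letter residue codes plus three non-residue symbols (*, / and \). As a minimal sketch of how a segment from this table might be tallied under that scheme, the Python below uses a hypothetical helper, summarize_segment, which is not part of the original document. It assumes that whitespace inside a segment is a layout artifact and that any character outside the legend (including lowercase letters) should be reported rather than interpreted.

```python
# One-letter amino acid codes exactly as listed in the table legend.
AMINO_ACIDS = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid", "E": "Glutamic Acid",
    "F": "Phenylalanine", "G": "Glycine", "H": "Histidine", "I": "Isoleucine",
    "K": "Lysine", "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine", "S": "Serine",
    "T": "Threonine", "V": "Valine", "W": "Tryptophan", "Y": "Tyrosine",
    "X": "Unknown",
}

# Non-residue annotation symbols from the same legend.
ANNOTATIONS = {
    "*": "stop codon",
    "/": "possible nucleotide deletion",
    "\\": "possible nucleotide insertion",
}


def summarize_segment(segment):
    """Hypothetical helper: count residues and annotation symbols in a segment.

    Returns (residue_count, annotation_counts, unrecognized_characters).
    Whitespace is skipped (assumed to be a layout artifact).
    """
    residues = 0
    marks = {name: 0 for name in ANNOTATIONS.values()}
    unrecognized = []
    for ch in segment:
        if ch.isspace():
            continue
        if ch in AMINO_ACIDS:
            residues += 1
        elif ch in ANNOTATIONS:
            marks[ANNOTATIONS[ch]] += 1
        else:
            # Anything outside the legend is reported, not guessed at.
            unrecognized.append(ch)
    return residues, marks, unrecognized
```

Reporting unrecognized characters instead of silently dropping them is a deliberate choice here: it makes transcription damage in a segment visible before the segment is used for anything else.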








QSKPGCYDNGKHyQINCX?MERTYLGWALVCTCYGGSRGFiJCESK 

PEAEETCFDKYTGNTYRVGDTYERPKDSMIWDCTCIGAGRGRIS 

CTIANRCHEGGQSYKIGDTWRRPHETCSGYMLECVCaLGNGKGEKT 

CFPIAEKCPDHAAGTSYWGETWEKPYQGWMMVDCTCtiGEGSGR 

lTCTSRNRCm>QDTRTSYRIGDTWSiCKDNRGMLljQCICTGNGRG 

EWKCERHTSVQTTSSGSGPFTDVRAA.VYQPQPHPQPPPYGHCVT 

DSGVVYSVGMQLA*KTQGNKQML\CTCLGNGVSCQETAVTQTYG 

GNSNGEPCVLPFXyNGRTFYSCTTEGRQDGHLWCSTTSNYEQDO 

KYSPCTOHTVLVQrrRGGMSNGALCHFPFLyNimNYTDCTSEQRR 

DNMKWOGTTQNYDADQKFGFCPMAAHBEICTTNEGVMYRIGDQW 

DKQHDMGHMMRCTCVGNGRGEWTCIAYSQLRDQCIVDDITYNVN 

DTFHKRHEEGHMUNCTCFGQGRGRWKCDPVDQCQDSETGTFYQI 

GDSWEKYVHGVRYQCycyGRGIGEWHCaPLQTYPSSSGPVEVFI 

TETPSQPNSHPIQWNAPQP3H1SKYILRWRPKNSVGRMKEATIP 

GHLNS YTI KGLXPG WYEGQLIS IQQYGHQEVTRPDFTTTSTST 

PVTSNT \VTGETTPFS PLVATSES VTEITASS FWSWVSASDTV 

SGFRVEYELSEEGDEPQyi>VIirSTATSV\NIP\DI,LPGRKYIVN 

VYQISEDGEQSLXIiSTSQTTAPDAPPDPTVDQVDDTSlWRWSR 

PQAPITGYRIVYSPSVEGSSTELNLPETANSVTLSDLQPGVQYN 

ITiyAVEENQESTPWlQQETTGTPRSDTVPSPRDliQFVEVTDV 

KVTXMWTPPESAVTGYRVDVIPVNIiPGEHGQRLPLSRNTF\AEN 

TGI.SPGVTYYFJCVFAVSHGRESKPLTAQQTTKL\DAPTNLQFVN 

ETDSTVLVRWTPPRAQ I TG YRLTVGLTRRGQPRQV3JVGPSVS KY 

PLRNLQPASEYTVSLVAIKGNQESPKATGVFTTLQPGSSIPPYN 

TEVTETTIVITWTPAPRIGPKLGVRPSQGGBAPREVTSDSGSIV 

VSGLTPGVEyVYTIQVI.RDGQBRDAP\lVNK\WTPr,SPPTWi:iH 

LEANPDTGVLTVSWERSTTPDITGYRITTTPTNGQQGNSLBEW 

HADQSSCTF\DWLEVPGLEYNVSVYTVKDDKES VP ISDTI IPAV 

PPPTDLRFTN/IIiGPDTMRVTWVAPPPSIDLTNFLVRYSPVKNE 

GRMI*QSLS IFFLSDW\AWliTNLI,PGT3YWSVSS VYEQHESTP 

\LRGRQKTGLDSP\ TG JDFS \ DITA\KSFT\ VHW\ lAPRA/TPI 

TGYRIR\HHPEHF\SGRPREDR\VPHSRNSITLTWI,TPGTBYW 

SIVALNGREESPLt*IGQQSTVSDVPRDI*EVVAATPTSIjI*l\SWD 

APAVTVRYYRITYGETGGNSPVQEFTVPGSKSTAXrSGIiKPGVD 

yTITVYAVTGRGDSPASSKPISINyRTEIDKPSQ^3QVTZ^VQDNS 

ISVKWLPSSSPVTGYRVTTT\PKNGPG\PTKXKTAGPDQTEMTI 

EGLQPTVEYWSVYAQNPSGESQPLVQTAVTNIDRPKGIiAFTDV 

DVDSIKIAWESPQGQVSRYRVTYSSPEDGIHEIiFPAPDGEEDTA 

ELQGLRPGSEYTVSWALHDDMESQPLIGTQSTAIPAPTDLKFr 

QVTPTSLSAQWTPPNVQLTGYRVRVTPKEKTGPMKEINliAPDSS 

SVWSGI^ATrYEVSVYAIiiayrLTSRPAQGVVTTLENVSPPRR 

ARVTDATETTITISWRTKTETITGPQVDAVPANGQTPIQRTIKP 

DVRSrriTGLQPGTDYKIYLYTLNDNARSSPWlDASTAIDAPS 

NLRFIiATTPNSLLVSWQPPRARITGYIIKYEKPGSPPREWPRP 

RPGVTEATITGLEPGTEYTIYVrALKNWQKSEPLlGRICKTDELP 

QLVTLPHPNIiHGPEiriDVPSTVQICTPFVTHPGYDTGNGIQLPGT 

SGQQPSVGQQMIFEEHGFRRTTPPTTATPIRHRPRPyPPNVGQE 

LTGLTRGAm'IIVEAIiKDOQRHKVREEVVTVGNSVWEGLNOPT 
DDSCPDPYTVSHYAVGDEKERMSESQFKIiIjCQCWSFGSGHFRCD 
SSRWCHDMGVNYKlGEKWDRQGENGQMMSCTOiGNGKGEFKCDP 
HEATCYDDGKTYHVGEQWQKEYLGAICSCTCPGGQRGWRCDNCR 

RPGGEPSPEGTTGQSYWQYSQRYHQRTlilTNVNCPIECPMPLDVO 
ADREDSRE 




5365


8066


703 


Rl>CCTGGGEGTPGASGKRGPAATTSLVLCIPSVPPPVPFPTLV?P 
PPSWRRQPPGGIRROFSRRLRREANLVATCIiPVRASLPHRt.NML 
RGPGPGLLLLAVLCLGTAVPSTGASKSKRQAQQMVQPQSPVAVS 
QSKPGCYDNGKHYQ I NQQWERTYLGNAIiVCTCYGGSRGFNCES K 
PEAEETCFDKYTGWTYRVGDTYERPKDSMIWDCTCIGAGRGRIS 
C^riANRCHEGGQSYKrG[XIWRPHETX3GYMr£CVCLGNrGKGE»r 
CKPXAEKCFDHAAGTSYWGETWBKPYQGWMMVDCTCIiGEGSGR 
I^CTSR^^^L^W^qDTRTSYRIGDTWSKKDNRGNLLQCtCTGNGI^G^ 

EWKCERHTSVQTTSSGSGPFTDVRAAVyQPQPHPQPPpyGHOjT 

DSGVVYSVGMQU\*KTQGNKQML\CTCLGNGVSCQETAVTQTYG 

GNSNGEPCVLPrmfGRTFYSCTTEGRQDGHLWCSTTSNYEQDQ 

KYSFCTDHTVLVQTRGGNSNKSALCHFPFLyKNHmrTDCTSEGRR 

DNMKWCGTTQNYDADQKFGPCPMAAHEEICTTMEGVMyRrGDOW 

DKQHDMGHMMRCTCVGNGRGEWTCIAYSQLRDQCXVDDITYNVN 

DTFHKRHBEGHMLNCTCPGQGRGRWKCDPVDQCQDSETGTPYQI 

GDSWEKyVHGVRYQCYCYGRGIGEWHCQPLQTYPSSSGPVEVFI 

TETPSQPKSHPIQWNAPQPSHISKYILRWRPKWSVGRWKBATIP 

GHLNSYTIKGLKPGWYEGQLISIQQYGHQEVTRnSPTTTSTST 

pvtsmtWtgettpfsplvatsesvtbitasspwswvsasdtv 

SGFRVEYELSEEGDEPQYI.VLPSTATSV\NIP\Dr,LPGRlCyiVN 
VYQISEDGEQSLILSTSQTTAPDAPPDPTVDQVDDTSIWRWSR 
PQAPITGYRIVYSPSVEGSSTELNLPETANSVTLSDLQPGVQVN 
ITIYAVEENOBSTPWIQQETTGTPRSDTVPSPRDLQFVEVTDV 
KVTIMWTPPESAVTGYRVDVIPVNLPGEHGQRLPrxSRNTFVAEN 
TGLSPGVTYYFKVFAVSHGRESKPI,TAQQTTKL\DAPTNLQFVN 
ETDSTVLVRWTPPRAQITGYRLTVGLTRRGQPRQYNVGPSVSKY- 
PLRNLQPASEYTVSLVAIKG-VQESPKATGVFTTLQPGSSIPPYN 
TEVTETTIVITWTPAPRIGFICLGVRPSQGGBAPREVTSDSGSIV 

vsgltpgveyvytiqvlrdgqerdapXivnkWvtplspptnlh 

LEANPDTGVI,TVSWERSTTPDITGYRITTTPTNGQQGNSLEEW 

HADQS SCTF\ DNLRVPGLEY^IVS VYTVKDDKESVPISDTI IPAV 

PPPTDUiFTN/ILGPDTMRVTW\APPPSIDLTNFI*VRYSPVKNE 

GRMJLQSLSIPFLSDN\AVVLrNX.LPGTSYWSVSSVyEQHESTP 

\LRGRQICrGLDSP\TCIDFS\DITA\NSFT\VHW\rAPRA/TPI 

TGYRIR\HHPEnF\SGRPREDR\VPHSRNSITI,TNI>TPGTEyw 

SIVAI*NGREESPLrjrGQQSTVSDVPRDLEWAATPTSLLI\SWD 

APAVTVRYYRITYGETGGKSPVQEFTVPGSKSTATlSGIiKPGVD 

YTITVYAVTGRGDSPASS:;. ISINYRrEIDKPSQMQVTDVQDWS 

ISVKWLPSSSPVTGyRVTTT\PKNGPG\PTKTKTAGPDQTEMTI 

EGLQPTVEYWSVYAQNPSGESQPLVQTAVTNIDRPKGUVFTDV 

DVDSIKIAWESPQGQVSRYRVTYSSPEX)GIHEI.PPAPDGEEDTA 

ELQGLRPGSEYTVSWALHDDMESQPLIGTCJSTAZPAPTBLKPT 

QVTPTSLSAQWTPPNVQLTGYRVRVTPKEKTGPMKElNIiAPDSS 

SVWSGIiMVATKYWSVYALKDTtjTSRPAQGVVTTLENVSPPRR 

ARVTDATETTITISWRTKTETITGFQVDAVPANGQTPIQRTIKP 

DVRSYTITGLQPGTDYKIYLYTLNDNARSSPWIDASTAIDAPS 

NbRFLATTPNSLLVSWQPPRARITGYI IKYEKPGSPPREWPRP 

RPGVTEATrTGLEPGTEYTIYVIALKNNQKSEPLIGRKKTDELP 

Ql.VTLPHPNriHGPErU>VPS7VQKTPFVTttPGyDTGI7GIQLPGT 

SGQQPS VGQQM I FEEHGFRRTTPPTTATP IRHRPRPYPPNVGQE 

ALSQTTISWAPPQDTSEYIISCHPVGTDEEPLQFRVPGTSTSAT 

LTGLTRGATYNI IVEALKDQQRHKVRBEWTVGWSVNEGIiNQPT 

DDSCFDPyWSKYAVGDEWERMSBSGFKLriCQCLGFGSGHFRCD 

SSRWCHDNGVNYKIGEKWDRQGEMGQMMSCTCIiGNGRGEPKCDP 

HEATCYDDGFCTYHVGEQWQKEYLGAICSCTCFGGQRGWRCDNCR 

RPGGEPSPEGTTGQSYNQYSQRYHQRTNTNVNCPIECFMPLDVQ 
ADREDSRE 




5366


8066


703 


KX.CCTGGGEGTPGASGKRGPAATTSIiVLClPSVPPPVPFPfLWP 
PPSWRRQPPGGIRRDFSRRLRREANLVATCLPVRASLPHRLNML 
RGPGPGLLLIiAVLCLGTAVPSTGASKSKRQAQQMVQPQSPVAVS 
aSKPGCYDNGKHYQINQQWERTYLGWALVCTCYGGSRGPNCESK 
PEAEETCPDKYTGNTYRVGDTYERPKDSMIWDCTCIGAGRGRIS 
^TIANRCHEGGQSYKIGDTfrJRRPHETGGYMLECVCLONGKGEWT 

:kpiabkcfdhaagtsywgetwbkpyqgp/mmvdctclgbgsgr 

CTCTSRNRCNDODTRTSYRIGDTWSiOCDNRGNLXySClCTGNGRG 
:S?KCERHTSVQTTSSGSGPFTDVRAAVYQPQPHPQPppyGHCVT 
)SGWYSVGMQLA*KTQGNKQML\CTCLGNGVSCQETAVTQTYG4' 
JNSNGEPCVLPPTYNGRTPYSCTTEGRQDGHXiWCSTTSNYEQDQ 
KySFCn-DHTVLVCyrRGGWSNGALCHFPPtYWWHNVTDCTSEGRR 
DNMKMCGTTQWYDADQKFGFCPMAAHEEICTTNEGVMYRIGDQW 
DKQHDMGHMMRCrCVGNGRGEWTCrAySQLRDQCIVDDlTYNW 
DTFHKHHEEGHMLNCTCFGQGRGRWKCDPVDQCQDSETGTFyQI 
GDSWEKYVHGVRyQCYCyGRGlGEWHCQPLQTYPSSSGPVEVFI 
TETPSQPNSHPIQWKAPQPSHISKYItoRWRPKNSVGRWKEATIP 
GHLNSYTIKGLKPGWYEGQLISIQQYGHQEVTRFDFTTTSTST 
PVTSNT\VTGETTPFSPLVATSESVTEITASSFWSWVSASDTV 
SGFRVEYELSEEGDEPQYIiVLPSTATSV\NIP\DLLPCRKYIVN 
VYQ IS EDGEQSLI LSTSQTTAPDAPPDPTVDQVDDTS I WRWSR 
PQAPrTGYRlVYSPSVEGSSTELNI.P3TANSVTliSDLQPGVQYN 
ITIYAVEENQESTPWIQQETTGTPRSDTVPSPRDIiQFVEVTDV 
KVTI MWTPPESAVTG YRVDVI PVK1>PGEHG0RLPLSRWTF\AEW 
TGLSPGVTYYFKVFAVSHGRESKPIiTAQQTTKLXDAPTNLQFVN 
ETDSTVIiVRWTPPRAQITGYRLTVGLTRRGQPRQYNVGPSVSKY 
PLRNLQPASEYTVSLVAI KGNQES PKATGVFTTLQPGSS IPPYN 
TEVTETTIVin^rPAPRIGFKLGVRPSQGGEAPREVTSDSGSIV 
VSGLTPGVEYVYTIQVLRDGQERDAPV ivnk \vvtplspptnlh 
LEANPDTGVtiTVSWERSTTPDXTGYRITTTPTNGQQGNSIjEEW 
IIADQSSCTP\DNLjEVPGIiEYNVSVYTVKDDKESVPISDTIiPAV 
PPPTDtiRFTN/ ILGPDTMRVTW\APPPS IDLTKFLVRYSPVKNE 
GRMLQSLSIFFLSDN\AWLTNItLPGTEYWSVSSVYEQHESTP 
\LRGRQKTGLDS P \TG I0FS \D1TA\NSFT \VHW\IAPRA/TPI 
TGYRI R\HHPEHP\SGRPREDR\VPHSRNS ITLTNLTPGTEYW 
SlVAliMGREESPLIiIGQQSTVSDVPRDL.EWAATPTSIil\SWD 

apavtvryyritygetggnspvqeftvpgskstatisglkpgvd 
ytitvyavtgugdspasskpi s inyrte idkpsqmqvtdvqdns 
isvkwlpssspvtgyrvtttVpkngpgXptktktagpdqtemti 
eglqptveywsvyaqnpsgesqplvqtavthldrpkgiiaftdv 
dvds i kiawes pogqvsryrvtyss pedgihelfpapjdgeedta 
eiiqglrpgseytvswalhddmesqpligtqstaipaptdlkft 

QVTPTSLSAQWl^P PNVQLTGYRVRVTPKEKTGPMKEINLAPDSS 
SWVSGIJWATKYEVSVYALKDTLTSRPAQGVVTirLENVSPPRR 
ARVTDATETTITISWRTKTETITGFQVDAVPANGQTPIQRTIKP 
DVRSYTITGLQPGTDYKIYLYTIiNDNARSSPWinAS'EAIDAPS 
NURFLATTPMSriliVSWQPPRARITGYIIKYEKPGSPPREWPRP 
RPGVTEATITGLEPGTEyTIYVIALKNNQKSEPI,IGRKKTDEIjP 
QLVTItPHPNliHGPEirjDVPSTVQKTPFVraPGyDTGNGIQLPGT 
SGQQPS VGQQM I FEEHGFRRTTPPTTATPIRHRPRPYPPNVGQE 
ALSQTTISWTlPFQDTSEryilSCHPVGTDEEPLQFRVPGTSTSAT 
LTGLTRGATYWI IVEAIiKDQQRHKVREEVVTVGNSVNEGLNQpr 
DDSCFDPYTVSKYAVGDBKERMSESGFKLUX2CIiGFGSQHFRCD 
SSRWCHDNGVNYKIGEKWDRQGENGQMMSCTCLGNGKGEFKCDP 
HEATCYDDGKTYHVGEQWQKEYIXSAICSCrCPGGQRGWRCDNCR 
RPGGEPSPEGTTGQSYNQYSQRYHQRTNTNVNCPIECFMPLDVQ 
ADREDSRE 


5367 


235 


3591 


KKILNMLCKKNXVIEYLADILYEYLYGFCPSGIKKYLtlHVUeL ~ 

EVMMRKQDS/RIVGNGSEQQLQKELADVIiMDPPMDDQPGEKBLV 
KRSQLDGEGDGPLSNQLSASSTINPVPLVGLOKPEMSIjPVKPGQ 
GDSEASSPFTPVADEDSWFSKLTYIjGCASVNAPRSEVEALRMM 

sri,rsqcqisldvtlsvpnvsegivrlij3pqtnteiamypiyki 
lfcvrghdgtpesdcfafteshynaelfrihvfrceiqeavsri 
lyspatafrrsakqtplsataapqtpdsdiftfsvsleikeddg 
kgyfsavpkdkdrqcfklrqgidkkiviyvqqttmkelaiercf 

GLLIiSPGKDVRNSDMHIiLDLESMGKSSDGKSYVTTOSWNPKSPH 

fqwneetpkdkvi,fmttavdlvitevqepvrflletkvrvcsp 
nerlfwpfskrstteirffi.klkqrkqrerknntdti,yevvcdbs 
esererrkttaspsvrlpqsgsqssvipsppeddeeedndepll 
sgsgdvskecaekiletwgellskwhlni.kvrpkqlsslvrngu 
pealrgevwqllagchnndhlvekyrilitikespqdsaltrdln 
RTFPAHDYFKDTGGDGQDSLYKICKAySVYDEEIGYCQGQSFLA 
AVLLLHMPEEQAFSVLVKIMPDYGLRELFKQNFEDLHCKFYQbE 
RLMQEyiPDX,YNHFI*DISLEAHMYASQWFI,rLFTAKFPI,YMVFH 
IIDLIiLCBGISVTFNVALGLLKTSKDDLW.rDFEGALKFFRVQL 
PKRYRSEENAKKLMELACNMKISQKKLKKYEKEYHTMREQQAQQ 
EDPrERFERE»IRRI^EANMRLEQENDDLAHBI.VTSiaALRKDLD 
NAEEKADALWK3LLMTKQKLIDAEEEKKRLEEESAHLKKMCRRE 
LDfCAESEIKKNSSriGDYKQICSQLSERLEKQQTAKKVEIEKIR 
QKVDDCERCREFFNKEGRVKGlSSTKEVIjDEDTDEEKETLKNQri 
REMELEI4AQTKL\QLVEASCK1QD\LEHPF*GLPFNE\VQAA\K 
KTWFNRTLSSIKTATGVQGKETC 


5368 


573 


2014 


gaaagaadprrgsumrtmldfaifavtflLalvgavlylypas 

RQAAGIPGITPTEEKDGNLPDIVNSGSLHEFLVNLHERYGPWS 
FWFGRRLWSI/3TVDVLKQHIWPNKTLD/IiF*NHAEVIIKVSIW 
WWQCE*KP\QRiaCLYBNGVTDSLKSNPALLI,KliPEELLDKWt*Sy 
PETQH\VPLSQHMt/3FAMKSVfQMVMGSTFEDDQEVlRFQKNHG 
TVWSEIGKGFLDGSLDKWMTRKKQYEDALMQLESVLRNIIKERK 
GRNFSQHIFIDSLVQGNLNDCX3IDED3M1FSIASCIITAKLCTW 
A2WFLTTSEEVQKKLyEEINQVFGWGPVTPEKIEQI.RYCQHVLC 
ETVRTAKLTPVSAQLQDIEGKIDRFIIPRETLVLYALGWI^DP 
NTMPSPHKFDPDRFDDEt*VMKTFSSLGFSGTQECPELRPAYMVT 
TVLLSVIiVKRLHLLSVEGQVIETKYELVTSSREEAWITVSKRY 


5369 


1 


6622 


PRSLCFSliWAEAAVIjADGGIiRRRRRt.I^GTMSASFVPNGA$LBD 

CHOSTLFCI^LTGIKWKKTVWQGPTSAPILFPVTEEDPILSSPS 

RCI.KADVLG/VWRRDQRPERRE\L* IFWGGEDPX VLLTI.FTMTy 

QKKKMECX3RMDFPMNAVLCFSKAVHNU^RCt^INRNFVRIGICMF 

VKPYEKDEKPlNKSEHLSCSrrFFLHGDSNVCTSVEINQHQPVY 

LLSEEHITIAQQSNSPFQVILCPFGLNGTLTGQAFKMSDSATKK 

LIGEWKQFYP ISCCLKBMSBEKQEDMDWEDDSIAAVEVLVAGVR 

MIYPACFVLVPQSDIPTPSPVGSTHCSSSCLGVHQVPASTRDPA 

MSSVTLTPPTSPEEVQTVDPQSVQKWVKFSSVSDGFNSDSTSHH 

GGKI PRKLANHWDRVWQECNMNRAQNKRKYSASSOGLCEEATA 

AKVASWDFVBATQRTNCSCLRHKNIiKSRNAGQQGQAPSLGQQQQ 

IIjPICH1CTNEKQEKSEKPQKRPLTPFHHRVSVSDDVC5MD\ADS\A 

SQRLV\ISAP\DSO\VRFSNrR\TKDVAK\TPQMHGTEMANSPQ 

PPPI*SP\HPCDWDEGVTKTPSTPQSQHFYQMPTPDPLVPSKPM 

EDRIDSLSQSFPPQYQEAVEPTVYVGTAVNLEEDEANIAWKYYK 

FPKKKDVEFI/PPQLPSDKFKDDPVGPFGQESVTSVTELMVQCKK 

PLKVSDELVQQYQlKWQCLSArASDAEQEPKIDPYAFVEGDEEF 

liFPDKKDRQNSEREAGKKHKVEDGTSSVTVLSHEEDAMSLFSPS 

IKQDAPRPTSKARPPSTSLIYDSDLAVSYTDLDWLiFNSDEDELT 

PGSKRSANGSDDKASCKESKTGNLDPLSCISTADLHKMYPTPPS 

LEQKIMGFSPMNMNNKEYGSMDTTPGGTVLEGNSSSIGAQFKIE 

VDEGFCSPKPSEIKX>PSYVYKPENCQILVGCSMFAPLKTI.PSQy 

LPLIKLPEECIYRQSWTVGKLELLSSGPSMPPIKBGDGSNMDOE 

YGTAYTPQTHTSCGMPP5SAPPSNSGAGII,PSPSTPRFPTPRTP 

RTPRTPRGAGGPASAQGSVKYENSDLYSPASTPSTCRPLKSVEP 

ATVPSIPEAHSLYVNLILSESVMNLFKDCNSDSCCICVCNMNIK 

GADVGVYlPDPTQEAQYRCTCGFSAVMWRKFGNNSGIiFFEDELD 

XJGRNTDCGKEABKRFEAIJiATSABHVIfGGLKESEKLSDDLILL 

LOIXJCriTLFSPFGTUyDQDPFPKSGVlSmnmVEERDCCa^DCYlA 

LEHGRQFMDNMSGGKVDEALVKSSCIiHPHSKRNDVSMQCSQDlL 

RMLLSLQPVLQDAIOKKRTVRPWGVQGPIiTWQQFHKMAGRGSYG 

TDESPBPLPIPTFLLGYDYDYLVLSPFABPYWERLMLEPYGSQR 

DIAYWLCPENEALIiNGAKSFFRDLTAIYESCRLGQHRPVSRLL 

TDGIMRVGSTASKKLSEKLVAEWFSQAADGNNEAFSKLKLYAQV 

CRYDLGPYLASLPLDSSLLSQPNLVAPTSQSIilTPPQMTNTGNA 

NTPSATlASAASSTMTVTSGVAIS'rSVATANSTLTTASTSSSSS 

SNLNSGVSSNKLPSFPPPGSMKSNAAGSMSTOANTVQSGQr.GGQ . 

^TSALQTAGISGESSSLPTQPHPDVSESTMDRDKVGIPTDGDsJf 

^VTYPPAIVVyiiDPFTYENTDESTNSSSVWTLGI/IJlCFIJsaW 
TLPPH IKSTVS VQI I PCQYLLQPVKHEDREI YPQHLKSIiAFSAP 
TQCRRPIiPTSTNVKTLTGFGPGLAMETAIiRS PDRPECI RliYAyp 
FILAPVKDKQTELGETFGEAGQKYNVLFVGYCLSHDQRWrLASC 
TDLYGELLETCriNIDVPNRARRKKSSARKFGLQKLWEWCLGLv 
QMSSLPWRWIGRLGRIGHGELKDWSCLLSRRNLQSLSKRLKDM 
CRMCGISAADSPSIIiSACLVAMEPQGSFVlMPDSVSTGSVFGRS 
TTLNMQTSQUMTPQDTSCrr-tlLVFPTSASVQVASATYTTENUDL 
AFKPNNDGADGMGXFDLIiDTGDDLDPDI INI LPASPTGS PVHSP 
GSHYPHGGDAGKGQSTDRLI>STEPHEEVPK I WJQPLALGYFVST 
AKAGPLPDWFWSACPQAQYQCPLFLKASIiHLHVPSVQSDEIjIiHS 
KHSHPLDSNQTSDVLRFVLEQYWAI*SWIjTCDPATQDRRSCLPIH 
FWLNQT..YNPIMNM1. 


5370


1226 


716 


RWSRKLELRRAAQATESRPPQSQEMHPPTGKEVHALKRLRDSAN 
ANDVETVQQLLEDGADPCAADDKGRTALHFASOTGNDQIVQIjLr. 
DHGADPNQRDGItGNTPLHIjAACTNHVPVITTLLRGGASVDALCR 

agrtplhlaksklmilqeghaqclkavr/hggeaohpyaegvsg 
aprat*aarcsgvfpspsrwlgsapwsrssctiwslplheakcr 

AVRPLSSAAQGSAPSSSSCCTVSTSX4AIiAESLSLFRACTSI,PVG 
GCISS^L 


5371 


1331 


167 


lAAMLWKLXjLRSQSCRLCSFRKMRSPPKYRPFIAaFTYTfSiOgs" 
S KENTRTVE KL YKCS VD I RKI RR \ * KDG Y F * RMKPMLKKLRI / P 
I>QELGADETAVAS ILERCPSAl VCS PTAVNTQRKLWQIjVCKNEE 
ELIKLIEQFPESFFTIKDQSNQKX^QFFQELGLKNVVlSRtiLT 
AAPNVFHNPVEKNKQMVRIIOESYLDVGGSEANMKVWLIiKIiLSQ 
NPFI liU^SPTAlKETLEFl^OEQGFTSFE I LQLLSKLKGFIjFQLC 
PRS IQNS ISFSKNAFKCTDHDLKQUVLKCPALLYYSVPVLEERM 
QGLLREaiSIAQIRETPKVI,EI*TPQIVQYRIRKIiNSSGYRIKDG 
HLANLNGSKKEFEANFGKICJAKKVRPLFNPVAPIiNVEE 


5372


51 


857 


SPGAQFLWAAPDMPDPLFSAVQGKDEILHKAXiCFCPWLGKGGME 
PtiRIiLILLFV^ELSGAHKTTVFCym^GQSIiQVSCPYDSMKHWGR 
RKAMCRQLGEKGPCQRWSTHNLWIjIiSFLRRWNGSTAITDDTLG 
GTLTITrJUJIiQPHDAGr.yQCQSIjHGSEADTLRKVLVEVrjVDPLD 
HRDAQDLWFPG\DUlASRM?MWSTASPaASWKEKSPSHPI.PSFS 
SWPASFSSRF*QPAPSGLQPGMDRSQGHIHPVNWTVAMTQGISS 
KLCQG 


5373 


2814 


346 


VKKTKSIFNSTy^OEMEVYVENIRRKFGVFNYSPPRTPYTPNSQY 
QMLLDPTNPSAOTAKIBKQEKVKLMFrWTASPKIIiMSKPVLSGG 
TORRISLSDMPRSPMSTNSSVHTGSDVEQDAEKKATSSHFSASE 
ESMPFIJ3KSTASPASTKTGQAGSLSGSPKPFSPQLSAPITTKXD 
KTSTTGSILMLWUJRSKAEMDLKELSESVQQQSTPVPLISPKRQ 
IRSRFQIiMliDKTIESCKAQLGIKEISEDVYTAVEHSDSEDSEKS 
DSS0SEYISDDEQKS*GTSQEDTEDKEGCQMDKEPSAVKKKPKP 
TNPVEIKEELKSTSPASEKADPGAVKDKASPEPEKDFSGKAKPS 
PHPlKDKLKGKDETDSPTVHLGLDSDSE\NEIiVIDl<GEDHSGRE 
GRKNKKEPKEPSPKQDWGKTPPSTTVGSHSPPETPVIiTRSSAQ 
TSAAGATATTSTSSTVTVTAPAPAATGSPVKKQRPIiPKE\TAP 
AVQRSCGTSSTVQQKEITQSPSTSTXTtjVTSTQSSPIiVTSSGSM 
STLVSSVNGDLPIGTASADVAADIAKYTSKX#\MDAIKGTM\TEI 
YNDLSKNXrnrJKAQIAEDSQGIJlIEIEKXQWXiHQQELXSEMKHN 
LELTMAEMRQSWBQBRDRLIABVKKQLELEKOQAVOETiaCKQWC 
ANFKKSAIFYCCWNTsyCDYPCQ\QAHWPEH\MKSCTQSATAPQ 
\QEAnAE\VNTETLNKSSQGSSSSTQSAPSETASA\SKEKETSA 
EKSKESGSTLDLSGSRETPSSILIiGSNQGSDHSR\StlKSSWSSS 
DEKRGS\TRSDHK/TPSTQHGRSLIiPGKESRAGTPFIi6TSK 


5374 


2814 


346 


VKKTKSlFNSAMQEMEVyVENIRRKFGVFNY^PFRTPYTPWSQY 
QMLiajPTNPSAGTAKIPKQBfCVKLNFDMTASPKILMSKPVIiSGG 
TGRRISI^SDMPRSPMSTNSSVUTGSDVEQnABKKATSSHPSASE 
ESMDFLDKSTAS PASTKTGQAGSIiSGS PKPFSPQLSAPITTKTD 
KTSTTGSILNLNLDRSKAEMDtiKELSESVQQQSTPVPLISPKRQ 
IRSRFQLPTLDKTIESCKAQLGXNEISEDVYTAVEHSDSEDSE!^ 
DSSDSEYISDDE0KS*GTSQEDTEDKEGCC5MDKEPSAVKKKPKP 






TNPVEI KEELKSTS PAS EKADPGAVKUKASPEPEKDPSGKAKPS 
PHP I KDKLKGKDBTDS PTVHLGLDSDSE\NELVIDLGEDHSGI;E 

grknkkepkepspkqdwgkxppsttvgshsppetpvltrssaq 
tsaagatattstsstvt vtapapaatgs p vkkqrpllpke \tap 
avqrscgtsstvqqkeitqspststittivtstqssplvtssgsm 
sti,vssvnadlpigtasadvaadiakytskii\mdaikgtm\tei 
vndlskn\ttwkaqlaedsqgijrxereklqwlhqqel\semkim 
leltmaemrqsweqerdrliaevkkqlelekqqavdbtkkkqwc 
anfkkeaxfyccmntsvcdypcq\qahwpeh\mksctqsatapq 
\qeadae\vntetunkssqgsssstqsapsetasa\skeketsa 
ekskesgstldlsgsretpssiliigsnqgsdhsr\snksswsss 
dekrgsNtrsdhn/tpstqhgrsllpgkesragtpflgtsk 


5375


2907 


1116 


hiflaeeepmlerrcrgplamgpaqprliisgpsqespq'tligkes 
rgi,rqqgtsva\qsgaqapgrahrcahcrrhfpgwva\lwlhtr 
rcqa/rglplpcpecgrrfrhappialhrqvhaaatpdwgfach 
lcgqsprgwvalvlhlrahsaakagppacpkmardafwrrkaas 

SSILRRCHPSRPRGPRPPICGNOGRSILPTWDQ/IiKVAHKRVHV 
SRRP* ERGPPAKVFWGPRPRGPPTGDTPPGPGGDAVDRP F\QCA 
CCGKRFRHK\PNrirRSHAACrSGERPHQ/CSRECG\KRFTNKPV 
LTSXHRRITHTARQPYPCKECGRRFRHKPNLLSHSKIHKRSEGS 

aqaapgpgspqlpagpqesaaeptpavplkpaqepppgappehp 
qdp leappslys cddcgrs?rlerfiirahqrqhtgerpftcaec 
gknfgkkthlvahsrvhsgerpfrlarkogrrplprasqsggrn 
saepnaprfgpfvcpdcgkaetrhkpylaahrpiatpaekpyvcp 

DCRKAFSQKSNLWSHRRIHXGERPYACPDCDRSFSQKSNIiITH 
RKSHIRDGAFCCAIOGQTFDDBERIiXAHQKICHDV 


5376 


4504 


591 


VSTFSIiCLWPAGGGGRGRVS>lMAQSKRHVYSRTPSGSRMSAEAS 
ARPLRVGSRVEVlGKGHRGl-VAYVGATLFATGKimSVILDEAKG 
KNDGTVQGRKYFTCDEGHGIFVRQSQIQVFEDGADTTSPETPDS 
SASKVLKREGTDTTAKTSKI.RGX1KPKKAPTARKTTTRRPKPTRP 
ASTGVAGASSSI^PSGSASAGELSSSEPSTPAQTPIAAPIIPTP 
VLTS PGAVPPIiPS PSKEEEGLRAQVRDliEEKIiETIiRLKRAEDKA 
KtiKEZiBKHKIQIiEQVQEWKSKMQEOQADLQRRLKBARKEAKEAL 
EAKERYMEEMADTADAIBMATIjDKEMl\EERAESIiQQEVEAIiKER 
VDELTTDLEILKAEIEBKGSDGAASSYQLKQLEEQNARLKDAIiV 
RMRDI^SSEKQEHVX\LQKLMEKKNQEr.EVVRQQRSRI^EELSQ 
AESTI DBLKEOVPAAIXSAEEMVEMLTDRNTitJLEEKVREIJlETVG 
DLEAMNEMNDBIiQENTUlETEuELRBQLDMAGARVREAQKRVEAA 
QETVADYQQTIKKYRQLTAHLQDVNRELTNQQEASVERQQQPPP 
ETFDFKrKFAETKAHAKAIEMELRQMEVAQAKRHMSLLTAFKPD 
SFLRPGGDHDCVLVLLLMPRLICKAELIRKQAQEKrELSENCSE 
RPGIiRGAAGEQl^SFAAlGIjVYVSLMPAAGHRYHRY^aiALSQCR 
LDVVYKKVGSLYPEMSAHERSLDFLIEI^LHKDQLDETVNVEPIiT 
KAIKYYQHIiYS IHLAEQPEDCTMQLAOHIKFTQSALDCMSVEVG 
RLRAFLQGGQEATDIALLLRDIiETSCsKDlRQPCKXIRRRMPGT 
DAPGlPAALAPGPQVSDTLliDCRKHLTWVVAVIiQEVAAAAAQIiI 
APLAENEGLLVAALEELAFKASEQIYGTPSSSPYECliRQSCCrXL 

EGLGIiKr^EDRBTVIKELKKSLOKGEEtiSEANVRIiTLLEKKLDS 
AAKDADERIEKVQTRI*EETQALLRKKE KEFEETMDALQAD IDQL 
EAEKAELKQRLMSQSKRTIEGLRGPPPSGIATLVSGIAGEEQQR 
GAIPGQAPGSVPGPGLVKDSPLLLQQISAMRLHISQLQHENSIL 
KGAQMKASLASLPPIilVAKLSHEGPGSELPAGALYRKrSQLIiET 
LNQLSTHTHWDITRTSPAAKSPSAQIiMEQVAQIiKSLSDTVEK]:. 
KDEVLKETVSQRPGATVPTDFATFPSSAFIjRAKEBQODDTVYMO 
KVTFSCAAGFGORHRLVLTQEQtiHQLHSRLlS 


5377 


752


1106 


DVPCKRVLPAEAQEKGQLTLSCGESGEEG\F*YHEVRQAEGES* 
/WPGPNVIU-VHTQIiKTKKPSGTtJCAKFYLHTGSTKFAARISCTK 
SS*WPG YDGWWGGQYI FIFRGMRWEEQP 


5378


2009 


664 


QASGTTlO^PLPDI^PQX^KRREATSRNRAliKPRGRIiVmTSCLPAIit 
RFI ATPRXiS AMPHl DKDVKLDFKDVI.I1RPKRSTI1KSRSEVDLTR 
SFS FRNSKQTYSGVP I lAANMDTVGTFBMAKVbCKS * VPGSFWD~ 
VPQMGCVFLI YKLFTliKWKMI^LLSVLLPAS ILVAEKFSLPTi^VH 
KHYSLVQWQEFAGQNPDCLEHLAASSGTGSSDPEQLEQILEAIP 
QVKYlCLDVANGySEHFVEPVKDVRKRFPQttTIMAGNWTGEMV 
EELI LSGADI I KVG IG PGS VCTTRKKTGVQYPQLSAVMEGADAA 
HGLKGHIISDGGCSCPGDVAKAFGAGADFVMLGGMIAGHSESGG 
Et.! ERDOKKYKIiFYGMS S * I \AM VKKYAGGVAEYRASEGKTVHV 
PFKGDVEHTrRDILGGIRSTCTYVGAAKLKELSRRTTFIRVTQQ 
VMPIFSEAC 


5379 


2009 


664


QASGTTLRPLPDLPQLKRREATSRNRALKPRGRtiVLMTSCLPAl. ' " 

RFIAXPRLSAMPHIDNDVKLDFfCDVLrjRPKRSrLKSRSEVDLTR 

SFSFRNSKQTYSaVPIIAANMDTVGTFEMAKVl*CKS*VPGSFWD 

VPQMGCVFliI YKLFTLKWKMLLLSVliLPAS 1 LVAEKFSLFTAVH 

KHYSLVQWQEFAGQNPDCLEHtAASSGTGSSDFEQIiEQrLEAIP 

QVKYICLDVANGYSEHFVBPVKDVRKRFPQHTIMAGNWTGEMV 

EELILSGADI I KVGIGPGSVCTTRFCKTGVGYPQLSAVMECADAA 

HGLKGHI I SDGGCS CPGDVAKAFGAGADFVMLGGMIAGHSESGG 

BIiIERDGKiCYKLPyGMSS * I \AM\KKYAGGVAEYRASEGKTVEV 

PPKGDVEHTIRDILGGIRSTCTYVGAAKIiKEI^SRRTTPlRVTQQ 

VNPIPSEAC 


5380 


2 


2050 


PSRAGGAERGRAAAARS PGGSAAGWBCPSVLDEAGACTMSSCVS 
SQ PSSNRAAPQDELGGRGSSSS ESQKPCEALRG JjSS I»S IHLGME 
SFIVVTECEPGCAVDI^LARDRPLEADGQEVPLDTSGSQARPHL 
SGRKLSLQERSQGGIiAAGGSI*DMNGRCI CPSLPYSPVSSPQSS P 
RLPRRPTVESHHVSITGMQDCVQLNQYTLKDEIGKGSYGWKLA 
YNEWDNTYYAt-IKVLSKKKtiXRQAAFPRRPPPRGTRPAPGGCIQP 
RGPI \EQVYQEIA\ ILKKLDHPNW\KLVBVr,\DDPNEDHLYMV 
FXELVNQGPVMEVPTLKPLSEDQARF YFQDLI KGIEYLHYQKI I 
HXrdIKPSNLLVGEDGHIKIADFGVSNBFKGSDATiCjSNTVGTPA t , 
FMAPESLSETRKIFSGKALDVWAMG VTLYCFVFG* CP FMDERXM 
CLHSKIKSQALEFPDQPD I AEDLKD]:,ITRMLDKNPESRI WPE I 
JaHPWVTRHGAEPLPSEDSNCTLVEVTEEEVENSVKHIPSIATV 
ILVXTMIRKRSF^3NPFEGSRRBERSLSAPGNI,LTKKPTRECBSI. 
SEIiKT* Kl S PIjPACCKVT* EFPHPSGCRPSCWQPPFUHTHSQPR 

*pepprtdealcpyetqrtcv4aplilqvlwwvgtplpfplstswr» 
pdlvgapgshfcflnial::*rynshtm 


5381


2 


2050 


PSRAGGAERGRAAAARSPGGSAAGWECPSVl.DjEAGACTMSSCVS 
SQPSSNRAAPQDELGGRGSSSSESQKPCEALRGLSS1*SIEIIjGME 

spiwtecepgcavdlglardrpleaixsqevplbtsgsqarphi. 
sgrklslqersqgglaaggsldmngrcicpslpyspvsspqssp 
rlprrptveshhvs itgmqdcvqiinqyti^kdelgkgsygwfcla 

YNBNDNTYYAMKVX/SKKKLiIRQAAFPRRPPPRGTRPAPGGCXQP 

RGPI VEQVYQE r A\ IX.ICKIiDHPNWVKLVEVL\DDPNEDHLYMV 

F\EbVNC3GPVMEVPTI,KPI*SEDQARFYFQDLIKGIEYIJiyQKII 

H\RD I KPSNLLVGEDGHI KIADPGVSNEFKGSDALLSNTVGTPA 

FMAPESLSETRKIFSGKALDVWAMGVTIiYCFVFG*CPFMDERIM 

CLHSKrKSQALEFPDQPDIAEDLKDLITRMr,DKWrPESRIWPEI 

roartf wv XKJrttjAIiPtjPSEDENC7rijtVEVTEEE\^ENSVKH 

ILVKTMIRKRSFGWPFEGSRREERSLSAPGNLLTKKPTRECESL 

SELKT*KISPIiPACCKVT*SFPHPSGCRPSCWQPPFLHTHSQPR 

♦PEPPRTDEALCPYETGRTCWAPLLQVLVWVGTPLPFPLSTSWL 

PDLVGAPGSHFCFLWI7U*LRYNSHTM - 


5382 


1536 


203 


GARGSQQDAPALQEABVRGPERAQPARGRMTKARLFRI,WLVX.GS 
VFMI LLII Vy WDSAGAAHFYLHTSFSR PHTGPPLPTPGPDRDRE 
LTADSDVDEPLDKPLSAGVKQSDIiPRKETEQPPAPGSMBESVRG 
YBWSPRDARRSPDQGRQQAERRSVLRGFCANSSIiAFPTKERPFD 
DIPNSELSHLIVDDRHGAXYCYVPKVACTNWKRVMIVLSGSLIiH 
RGAPYRDPtiRI PREHVHNASAHLTFNKFWRRYGKLSRHLMKVKCi 
KKYTKFLFVRDP FVRLl S APRS KFELEMEEF/ * PQVRRAHAAAV. 
RQPHQPARIiGARGLPRWPQ\VSFANPIQYLLDPHT3KXiAPFM^ 
WRQVYRLCHPCQIDyDFVGKT^IJJEDAAQrJ-QLIiQVtlLAAPLP 
PEX>PGTGPPSSWEEDWFAKlPIiAWRCX3LyKl/YBADFVLFGypjkP 
ENLLRD 


5383 


45 


52S0 


VERLLGCRNSKRTWRMLiSKNMPWRiaiQGISFGMYSAEELKKLS~" 

VKSITMPRYLDSLGNPSANGLYDLALGPADSKBVCSTCVQDFSN 

CSGHr^HIEI,Pi:»TVYWPLL.FDKI*Yi:J^LRGSCLWaiMX4TCPRAVI 

HLLLCQLRVliEVGALiQAVyEIiERIIiSRFLEBNADPSASEIREEL 

EQYTTEIVQNWlil^SQGAHVKNVCESKSKLIAlieWKAHMKIAKRC 

PHCKTGRSWRKEHNSKLTITPPAMVHRTAGQKDSEPLGIEEAQ 

IGKRGYLTPTSAREHLSAIiWKbrEGFFliEmjFSGMDDDGMESRFN 

PSVFFtiDPLWPPSRSRPVSRLGDQMFTNGQTVNLQAVMKDWI* 

IRKLLALMAQEQKLPEEVATPTTDEEKDSLlAIDRSPliSTLPGQ 

SLIDKLYNlWIRLQSHVNIVFDSEMDKLMMDKYPGIRQItiEKKE 

GLFRKHMMQKRVDYAARSVICPDMYINTNEIGIPMVFATKLTYP 

QPVTPWNVQELRQAVINGPNVHPGASMVINEIX3SRTALSAVDMT 

QREAVAKQLI.TPATGAPKPQGTKIVCRHVKNGDILLLNRQPTLH 

RPS IQAHRAR I IiPEEKVLRLHYANCKAYNADPDGDEMNAHFPQS 

EI^RAEAYVLACTOJQYLVPKIXKJPIAGLIQDHMVSGASMTTRG 

CFFTREHYMELVYRGLTDICVGRVKLLSPSILKPPPLWTGKQVVS 

TLLINI I PEDHIPLNLSGKAKITGKAWVKETPRS VPQFNPDSMC 

ESQVIIREGELl*CaVLDKAHYGSSAYGLVHCCyEIYGGETSGKV 

t.TauARLFTAYl.QLYRGFTla3VSDILVKPKADVKRQRIIEESTH 

CGPQAVRAAI*NLPEAASYD3VRGKWQDAHIiGKI>QRDPNMI DLKF 

KEEVWHYSNE INKACMPFGLHRQFPENTIiQLMVQSGAKGSTVNT 

M0ISCni.GQIEL.EGRSTPLMASGKSl*PCFEPYEFTPRAGOFVTG 

RFLTGIKPPEFFFHOIAGREGLVDTAVICTSRSGYLQRCI I KHLE 

GLWQYDLTVRDSDGSWQFLYGEDGLDIPIcrQFIiQPKQFPFIA 

SNYBVIMKSQHLHEVTLSRADPKKALHHFRAIKKWQSKHPNTLIiR 

ROAFLSYSQKIQEAVKAIiKIiESENRNGR/RPWDS/G/RMLRMWY 

EU>EES RRKYQKKAAACPDPSL>SVHRPDI YFAS VS BTFETKVDD 

YSQEWAAQTEKSYEKSELSLDRLRTLLQIiVKWQRSIiCEPGEAVG 

liLAAQS IGEPSTQMTLKTFHFAGRGEMNVTLGI PRLREILMVAS 

AKIKTPMMSV P VLNTKKALKRVKSLKKQLTRVCLGEVLQKIDVQ 

ESFCMEEKQNKFQVYQLRFQFLPHAYYQQEKCLRPEDILRFMET 

RPFKLLMES IKKKNNKASAFRNVNTRRATQRDLDNAGELGRSRG 

EQEGDEEEBGHIVDAEAEEGpADASDAKRKEKQEEEVDYESEEE 

EEREGEENDDEDMQEERNPHREGARKTQEQDEEVGIi/GH*GGPV 

PSRPPDAAPETHPQPGAPGA\EAMERRVQAVREIHPPIDDYQYD 

TEESLWCQVTVKIiPLMKINFDMSSLVVSLAHGAVIYATKGITRC 

LLNETIUNKNE KEIiVUJTEG INLPELFKYAE VLDIiRRIjYSND IH 

aiantygieaalrviekeikdvfavygiavdprhlslvadymcp 
egvykplnrfgirsnssplqqmtfetsfqflkqatmlgshdblr 
spsaclwgkwrggtglpelkqplr 


5384




ft ft c 
o Ob 


QSCX^RliPT VIj* li* GPPGSCPCt]jSIiF\PGRPHALPSIRPYINI 
TXLKGDKGDPGPMGLPGYMGREGPCXSEPGPQGSKGDKGEMGSPG 
A?C0KRFFAFSVGRKTALESGEDFQTLIiFERVFVNLDGCFr»4AT 
GQFAAPJURGiyFFSIiNVHSWNYKETYVHIMHNQKEAVILYAQPS 
ERSIMQSQSVMIiDUVYGDRVWVRLFKRQREMAlYSNDFDTYITF 
SGHLIKAEDD 


5385


326 


739 


LMVPRTKKEAPAPPKAEAKAKAXiXKAKKAVLKDVKSHKKKKIHM 
SPTPRRPKTIi*LRRQPKYPWKSTPRRKKLDHHVTIKFPLTTE*A 
VKKIE^^KSLLVFTVPVKRNKHQrKQAVKK/LCDIDVAKVWTIilQ 
SDGERKAYVRIaAPDYDAliWATKIGIT 


5386 


326 


799 


LMVPRTKKEAPAPPKAEAKAKAL \ KAKKAVUOJVHSHKKNKIHM 
SPTFRRPKTL*LRRQPKYPWKSTPRR1IKLDHHVIIKFPLTTE*A 
VKKIENWSLL*VFTVDVKANKHQrKQAVKK/X.CDIDVAKVNTIiIQ 
SDGERKAYVRLAPDYDALWATKIGIT 


5387 


2 


2117 


FWAASGGCWFVLGERRAGSLLSASYGTFAMPGMVIiFGRRWAIA 
SDDLVFPGFFELWRVLWWIGILTLYLMHRGKIiDCAGGAIjIiSSY 
L I VLMII-IAWICTVSAI MCVSMRGTICMPGPRKSMSKLLYIRL 
ALFPPEM VWASZjCSAAWADGVQCDRTVVNG riATVVVS WT 1 lA^ 
TVVS II I VFDPIiGGKMAPYSSAGPSHLDSHDSSQriliNGI/KTAAT 
SVWETRIKLLCCCIGKDDHTRVAPSSrAELFSTyPSDTDI/VPSD ~ 

iaaglallhqqqdnirnnqspaqwchapgssoeadldaeliSjc 
hhymqpaaaaygwplyiyrnpltgl»crlggdccrsknpqtmt/m 
vggdqlql/ctsapiiihthraavqglhprqlpwtrftelpflva 
rdhrkesvwavrgtmslqdvltdlsaesevldveoevqdriah 
kg isqaaryvyqrlindgr lsqafs lapeyrlvi vghslgc3gaa 
allattwraaypqvrcyapspprglwskalqeysqsfivslvijg 

KDVIPRLSVTNLEDLKRRrLRWAHCNKPKYKIIjtiHGI.WyEIiPG 
GNPNNliPTEIiDGGDQEVLTQPLLGEQSLLTRWSPAYSFSSDSPL 
DSSPKYPPIiYPPGRIIHLOEEGASGRFGCCSAAHYSAKMSHEAE 
FS K I L 1 GPKMIiTDHM PD ILMRALDS WSDRAACVSCPAQGVSSV 
DVA 


5388


1569 


753 


TADGGAGGGGRRQAGVRRHYLYPPTG^^RRRRAACQjAERPAXRS 
KDTDUVAYQKGNLGVQLRNMAQETNHSQVPMl/CSTGCXSFYGMPR 
TNGMCS VCYKEHLQRQNSSNGRIS P PVQCTDGS VPEAQSALDST 
SSSMQPSPVSNQSLliSBSVASSQLDSTSVDKAVPETBDVQASVS 
DTAQQPSEEQSKSLE\NRNKKRIAVSCAGRKWDLLGO«AGVEMF 
TWyTVTQMYTIALTITKQMDKNFVFQQEPKSFGSFHQQXOiBYK 
ILEKLQTKN 


5389


1569 


753 


TADGGAGGGGRRQAGVkRJtlVLVPFTTiGlffeRftRAXeQAERPAXfeS 
KDTDLTAYQKGNIXSVQLRNMAOETKHSQVPMLCSTGaSFYGNPR 
TNGMCSVCYKBHl^RQNSSNGRISPPVQCTDGSVPHAQSALDST 
SSSMQPSPVSNQSLIiSESVASSQLDSTSVDKAVPETEDVQASVS 
DTACKJPSEEQSKSLE\NRMKKRIAVSCAGRKWDI*I*Gt*JAGVEKF 
TVVYTVTQMYT1ALTITKQMIiKNFVFC3QEFKSFGSFHQQI,IjEYK 
Xr.EHLQTKN 


5390 


217 


1332 


edprklmedkmwsecegpemslvcltdfqahareqlskstfebfi 
eggaddsitrddniaafkrirlrprylrdvsevdtrttiqgeei 
sapiciaptgphclvwpdgemstaraaqar\gicyitstfascs ^ 

LEDIVlAAPEGJ^WFQLYVHPDI^IiNKQIjIQRVESLGFkAIiVrT 
LDTPVCGWRRMb'lRNQr>RRNLTIiTDIiQSPKKGNAIPYPQMTprS 
TSLCWNDLSWFQSXTRIjPI ILKGILTKEDAEIxAVKKNVQGI I vs 
NHGGRQLDEVtAS I DALTE WAAVKGKI EVYLDGGVRTGKDVLK 
ALAI.GAKCIFLGDAILWAIiASKGEHGVKEVLNII,TNEPHTSMA\ 
LTGCRS VAE INRISTLVQPSRIr " 


5391


1 


1292 


VKKAAGRSRGPPTAGGQRCEEAPGTVMERRLGVRAWVKENRGSF 
QPPVCWKr^lQEQLKVMPVGGPNTRKDYHIEEGBEVFYQIiEGDM 
VUIVLEQGKHRDWIRQGEIFLLPARVPHSPQRFANTVGLWER 
RRLETELDGriRYYVGDTMDVLFEKMFYCKDLGTQLAPIIQEFFS 
SEQYRTGKPIPDQLLKEPPFPLSTRSIMEPMSIiDAWIiDSHHREL 
QAGTPI.Sr.FGDTYETQVIAYGQGSSEGIiRQKVDW71.WQl,BGSSV 
VTMGGRRLSLGPWMDSItLVLSWGPSYVAWVERTQGSVALSVrXQ 
DPACKKSPWGEPSCHGLKAATGVPSTLEVPSLPNNSPSPHYLSV 
YCRCVPHRPAHCCHPPSCPSQPRCIJAPQRAAAPHZiIiWQTQPTAL 
PVLPGOLPPAPM^P I PI/SLQTQCSTSTPRRPSIKAS 


5392


1 


1623 


I RGSNAQKWGAS GSGGAGPQPDPAGPGGVPAliAAAVIjGACE PR 
CAAPCPLPAT^SRCRGAGSRGSRGGRGAAOSGDAAAAAEWIRKGS 
PIHKPAHGWi:aPDARVrX3PGVSYVVRYMGCaEVIJiSMRSt»DFNT 
RT^VTREAllJRItttEAVPGVRGSWKKKAPNKALASVI/GKSNLRFA 
GMSIS IHISTDGLSLSVPATRQVIANHHMPSISFASGGDTI»m) 
YVAYVAKDPXNQRACHIl4ECCEGL\AQSIISTVGQAFJ2IiRFKQY 
LHSPPKVAXjPPERXiAGPEESAWGDEEDSXiEHNYYNSIPGKEPPL 
GGLVDSRIiALTQPCAI.TALDQGPSPSLRDACSLPWDVGSTGTAP 
PGDGYVQADARGPPDHEEHLYVNTQGLDAPEPEDSPKKDLFDMR 
PFEDT^KLHECSVAAGVTAAPLPLEDQVJPSPPTRRAPVAPTEEQ 
LRQEPWYHGRMSRRAAERMLRADGDFLVRDSVTNPGQYVLTGMH 

agqpkhlllvdpegvvrtkdvlfesisklidhhlqngqpivaae 
selhlrgwsrep 


5393


2 


982


GGDSAGMTMETWSQNVCPRNLWIiIjQPt.TVUJLilASADSQAAJ^ • 
PKAVLKliEPPWINVbQX EDSVTliTCQGAPQP/ERSDS XQWFHI*
SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine, S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








\nlipthtqps\yrfkannn\dsgeytcqtgqtsl\sdpvhi.tv 
lsewlvlqtphiji3fqegetimlrchs\wrdkp\lvkvtffqnek 

SQKFSHliDPTFSIPQANHSHSGDYHCTGNlGyXI^FSSKPVTITV 
QVPSMGSSSPMGIIVAWIATAVAAIVAAWAIiiyCRKKRISAN 
STDPVKAAQFEPPGRQMIAIRKRQLEETNNDYKTADGGYMTIiNP 
RAPTDDDKNITLTLPPNDHVNSNN 


5394


2 


982 


GGDSAGMTMETQMSQNVCPim/WLLQPLTVLLLLASADSQAAAP 
PKAVLKLEPPWXNVLQ\EDSVTLTCQGAPQP/ERSDSIQWFHNG 
\NI*IPTHTQPS\yRFKANNN\DSGEYTCQTGQTSL\SDPVHLTV 
LSEWLVLQTPHLE FQEGETIMLRCHS \ WRDKP\liVKVTFFQNGK 
SQKFSHLDPTFSZPQANHSHSGDYHCTGWIGyTIiFSSKPVTXTV 
QV PSMGSSS PMGI I VAWI ATAVAAIVAAWALI YC3?KKRISAN 
STDPVKAAQFEPPGRQMIAIRKRQLEETNNDYETADGGYMTLNP 
RAPTDDDKNI YLTLPPNDHVNSNN 


5395


3135 


531 


RASDAKNQEGIiIiMTRRKSTDSVPISKSTLSRSIiSIjQASDFDGAS 
SSGNPEAVAIAPDAYSTGSS SASSTLKRTKKPRPPSLKKKQTTK 
KPTETPPVKETQQEPDEESLVPSGENLASETKTESAKTEGPSPA 
LLEETPLEPAAGPKAACPLDSESVEGWPPASGGGRVQNSPPVG 
RKTLPtiTTAPEAGBVTPSDSGGQEDSPAKGHSVRLEFDYSEDKS 
S WDNQQENP P PTKKIGKKP VAKMPLRRP KMKKTPEKIJDNTPASP 
PRSPAEPNDIPXAKGTYTFDIDKWDDPNFNPFSSTSKMQESPKL 
PQQS YNFDPDTCDESVDPFKTSSKTPSS PSKSPAS FEIPASAME 
ANGVDGDGI^KPAKKKKTPLKTDTFRVKKSPKRSPIjSDPPSQDP 
TPAATPETPPVlSAVVHATDEEKIiAVTNQKWTCMTVDLEADKQD 
yPQPSDI/STFVNETKFSSPTEELDyRNSYEIEYMEKIGSSLPQD 
DDAPKKQALYLMFDTSQESPVKSSPVRMSESPTPCSGSSFEETE 
ALVNTAAKNQKPVPRGrAPWQESHLQVPEKSSQKELEAmLGTP 
SEAIEITAPEGSFASADAIiLSRlAHPVSLOGALDYLBPDrAEKN 
PPLFAQKLQRKAAHPTDVSISKTALYSRIGTAEVEKPAGLbFQQ 
PDLDSAIiQIARAEIITKERE^/SEWKDKYEESRREVMEMRKIVAE 
YEKTlAQMIEDBQREKSVSXHQTVQOIiVIiEKEQANtJVDIiNSVEK 
VSLADLFRRYEKMKEVI^GFRKNEEVLKRCAQEYLSRVKKEEQR 
YQAI*KVHA\EEIUjDRANAE \ lAQVRGKAQQEQAAHQASLAERSS 
CRV\DAI,ERTI*EQKNKEXEELTKICDEI*IAKMGKS 


5396 


3135 


531


RASPAXNQEGLLNTRRKSTDSVPISKSTLSRSi*SLQASDt^DGAS 
S SGNPEAVAIiAPDAYSTGS SSASSTLKRTKKPRPPSLKKKQTTK 
KPTETPPVKETQQEPDEESLVPSGENItASETKTESARTEQPSPA 
LLBETPLEPAAGPKAACPLDSESVEGWPPASGGGRVQNSPPV6 
RKTLPliTTAPEAGEVTPSDSGGQEDSPAKGHSVRUEFDYSEDKS 
SWDNQQENPPPTKKrGKKPVAKMPIiRRPKMKKTPEKbDNTPAS? 
PRSPAEPNDIP IAKGTYTFD:: DKWDDPNFNPPSS TSKMQESPKL 
PQQSYWFDPDTCDESVDPFKTSSKTPSSPSKSPASFEXPASAME 
ANGVDGDGX^NKPAKKKKTPLKTDTPRVKKSPKRSPLSDPPSQDP 
TPAATPETPPVISAVVHATDEBKXiAVTNQKWTCMTVDXjEADKQD 
YPQPSDLSTFVWETKFSSPTEErjDYRMSYSIEYMEKIGSSr.PQD 
DDAPKKQALYLMFDTSQESPVKSSPVRMSESPTPCSGSSFEETE 
ALVNTAAKNQHPVPRGLAPKQESHIjQVPEKSSQKELEAMGLGTP 
SEAIEITAPEGSFASADALIiSRLAHPVSIiCGALDYLEPDLAEKN 
PPLFAQKLQREAAHPTDVSISKTALYSR IGTABVEKPAGTiLFOQ 
PDLDSALQIARAEI ITKEREVSEWKDKYEE5RREVMEMRKIVAE 
YEKTlAQMIEDEQREKSVS\HQTVQQLVXiEKEQA\XADLNSVEK 
\SI^LFRRYEKMKE\^EGFRKNEEVIjKKCAQEYIiSRVKKEEQR 
yQALKVHA\EBKLDRANAE\ lAQVRGKAQQBQAAHQASLAERSS 
CRVXDAIiERTLEQKNKEIEEtiTKICDEXiIAKMGKS 


5397 


3135 


531


RASDAKWQEG£iLNTRRKSTDSVPISKSTIjSRSLSX,QASDFDGAS 

ssgnpeavalapdaystgsssasstlkrtkkprppsuckkqttk 
kptetppvketqqepdeeslvpsgbmiasetio^saktegpspa 
lleetpitepaagpkaacpudsesvegwppasgggrvqnsppvg 

RKTtiPLTTAPEAGEVTPSDSGGQEDSPAKGHSVRLEFDYSEDKS 

swdi^qenppptkkxgkkpvakmpr.rrpkmkktpeklidntpas» 
prspaepndipiakgtytfdxdkwddpnfnpfsstskmqespkl 











PQQSyWFDPDTCJDESVDPFKTSSKTPSSPSKSPASFEiPASAME 
ANGVIX;iX5l^KPAKKKKTPLKTDTt'RVKKSPKRSPliSDPPS0DP 
TPAATPETPPVXSAVVHATDEEKIAVTNQKWTCMTVDZiEADKQD 
yPQPSDIiSTFVNETKFSSPTBBLDYRNSYEIEYMBKrGSSI.^QD 
DDAPKKQALYLMFPTSQESPVKSSPVRMSESPTPCSGSSFBETE 
ALVNTAAKNQHPVPRGLAPNQESHLQVPEKSSQKELEAMGLGTP 
SEAIEITAPBGSFASADALLSRIAHPVSI.CGALDyLEPDI*AEXN 
PPLFAQKLQREAAHPTDVSXSKTALYSRIGTAEVEKPAGLLFQQ 
PDLDSALQIARAEIITKERBVSEWKDKYEESRREVMEMRKIVAE 
YEKTIAQMIEDBQREKSVS \ HQTVQQLVUBKEQAXtiADLNS VEK 
\SliADLFRRYEKMKEVIiEGPRKNEEVLKRCAQEYl.SRVKKEEQR 
YQALKVHA\BEKLDRANAE \ lAQVRGKAQQEOAAHQASLAERSS 
CRV\DALBRTLEQKNKEIBELTKICDELIAKMGKS 


5398 


56 


5426 


SGEVCRMESNFNQEGVPRPS YVFSADP lARPSE INFDGIKtiDLS 

HBPSI^VAPNTEANSFESKDYLQVCLRIRPPTQSBKELESEGCVH 

ILDSQTWLKEPQCILGRX*SEKSSG\QM\AQKFSFPPGPLGPAT 

TQKEPFQGCIMHP\VKDLLKGQSRIilPTYGLTNSGKTYTFQGTE 

ENIRILPRTLNVIiFDSLQERLYTKMNIiKPHRSRBYLRLSSEQEK 

EEIASKSALLRQIKEVTVHNDSDDTLYGSIiTNSXiNISEFEESIK 

DYEQANLNMAWS IKFSVWVSFFEIYNKYIYDI.FVPVSSKFQKRK 

MLRIfiQDVKGYSFIKDLQWIQVSDSKEAYRLLKLGIKHQSVAFT 

KLNNASSRSHSIPTVKILQIEDSEMSRVIRVSEI»SLCDIJ\GSER 

TMKTQNEGERLRETGNINTSLLTLGKCINVLKNSEKSKFQQHVP 

FRESiCLTHYP/QSFFNGKGKICMIVNISQCYIAYDETt.NVLKFS 

AIAQKVCVPDTLNSSQEKLPGPVKSSQDVSLDSNSNSKILNVKR 

ATISWEWSLEDLMEDEDLVEELENABETED/VGETKLLDEDLDK 

TIiEENKAPrSHEEKRKU:.DriIEDLKKia:,INEKKEKl,TLEFKlRE 

EVTOEFTQYWAOREADFKETIiliOERElLEENAERRliAXFKDItVG 

KC0TREEAAKDICATKVETEEATACLELKPNQIKAEIAKTKGEL U 

IKTKEELKKRE^7ESDSLrQELETSWICKIITQNQRIKEr,I^^llDQ 

KEDTINEFQNLKSHMENTPKCNDKADTSSLIINNKLICNETVEV 

PKDSKSKICSERKRVWENELQQDEPPAKKGSIHVSSAITEDQKK 

SBEVRPNIAEIBDIRVLQeNNEGI,RAPLLTIE^TEI.KNEKBEKAE 

LNKQXVHFQQSt^LSBKKNLTLSKEVQQIQSNYDIAIAEIiHVQK 

SKNQEQEEKIMKXSNEIETi^TRSITlimVSQIKliMHTKlDEtJlTI* 

rSVSOISNIDIJ:^RDI^NGSEBONLPNTQU)LrjGNDYL.VSKQV 

KEYRIQEPNRENSFHS SIEAI WEECKEI VKASSKKSHQ lEELEQ 

QIEKLQAEVKGYKDENNRLKEKEHKNQDDLLKEKETIilQQLKBE 

IK?EKNVTLDVQIQHVVEGKRALSELTQGVTCyKAKIKEI.ETir.B 

TQKVERSHSAKLEQDlI/EKESIILKLEHHtiKEFQEHIiQDSVKNT 

KDIiNVKELiaKEEITOIiTNNLQDMKHIiLQLKEEEEETNRQETEK 

LKEBriSASSARTQNXliWADLQRKEEDYADLKEKLTDAKKQIKQV 

QK^SVmDEDKl4LRlKINEI^KKklTCK:SQELDMKQR\TIQQLK 

EQX»rNQKVEEAI QQYERACKDLNVKEKI lEDMRMTLEEQEQTQV 

KQDQVLXEAKLSEVERLATELDRWRVKCNDLErmrQRSNKEHE 

NNTDVI/5KL']m.QDElKJESEQKYNADRKKSaiEEKMMLlTQAKEA 

ENIRNKEMKKYAEDRERFFKXXJNEMEILTAQLTEKDSDLQKWRE 

ERDQLVAALEIQUCAI,ISSNVQKDNEIEQI,KRIISETSKIEIX3I 

MDIKPKRISSADPDKLQTEPLSTSFEISRNKIEDGSWLDSCEV 

STEKDQSTRFPKPELEIQFTPIiQPNKMAVKHPGCTTPVTVKIPK 

ARKRKSNEMEEDbVKCENKKNATPRTNLKFPlSDDRNSSViCKEQ 

KVArRPSSKKTYSLRSQASIIGVNLATICKKEGTLQKFGDFLQHS 

PSILQSKAKKIIETITISSSiajSNVEASKEKVSQPKRAKRKLYTSE 

ISSPlDlSGQVILMDQKMKESDHQIIKRRIiRTKTAK 


5399 


705


230 


tiPKKAKFLSQDQIl^YKECFSLYDKQQRGKIKATDLMVi^CLG 
ASPTPGEVQRHLQXHGIDGNGELOFSTFLTIMHMQIKQEDPKKE 

ii;iamij4\a?kekkgyvmasdlrskltslgeklthkev\ddlfre 
Xadiepngkvkydefihkitsyldgty 


5400 


931 


248 


shcssgmeipptnypasraalvaqnyinyqqgtphrvfevqkvk 
qasmedrpgrghkyrlkfavbeiiqkqvkvnctaevlypstgie 
tapevkftpegetgknpdeedntfyqrijksmkeprieaqnlxpdn








Jr^jNVbPEMTLVIiHLAWVACGYIIWQNSTEDTWYKMVfCIQTVKQV ' 
QRNDDF I ELDYTI LLHMIASQEI I PWQMQVLWHPQYGTKVKH^rs 
RLPKEVQLE 


5401


3 


1360 


TGWSYGPTTSLAFLAPRDFPFPPKluLIHPQAWRLSCGAGSMG^"" 

QAAAEWRNWASWEGSSSLSGCSMGCFKDDRIVFWTWMFSTYFME 

KWAPRQDDMIiFYVRRICliAYSGSESGADGRKAAEPEVEVEVyRRI) 

SKKLPGLGDPDIDWEESVCLNLILQKLDYMVTCAVCTRADGGDl 

HIHKKKSQQVFASPSKHPMDSKGEESKISYPNIFPMIDSP\EE\ 

VPSI>ITVGKGEMVCVELVASDKTNTFXX3VIFQGSlRYEALKKVy 

DNRVSVAARMAQK\MSPGFSKYSKMEF\VR\MKaPQGKGHABMA 

VSRVSTGDTSPCGTEEDSSPASPMHBRVTSFSTPPTPERNNRPA 

FPSPSLKRKVPRNRIAEMKKSHSANDSBEFFREDDGGADtiHKAT 

HLRSRSLSGTGRSLVGSWLKLNRADGNFLLYAHLTYVTLPLHRI 

LTDIIiEVRQKPIXiMT 


5402 


3445 


1563 


GECFIMAAWQQNDLVFEFASWVMEDERQLGDPAIPPAVIVEHV' 
PGADILNSYAGLACVBBPNDMITBSSLDVAEEErrDDDDDDITr. 
TVEASCHDGDETl ETI EAAEALLNMDSPGPMIJ5EKRINNNIFSS 
PEDDMWAP VTHVS VTI*DGIPEV1METQQVQEKYADS PGASS PEQ 
PKRKKGRKTKPPRPDSPATXPNISVKKKNKDGKGNTIYLWEFTjL 
ALLQDKATCPKYXKWTQREKGIFKLVDSKPVSRIiWRKHKNKPXD 
MKYEPMGRALRYYYQRG r iiAKVEGQRt.VYQFKEMPKDI*iy INDE 
DPSSS lESSDPSLSS SATSNRNQTSRSRVSSSPGVKGGATTVI^K 
PONS KAAKPKDPVEVAQPSEVtiRTVQPTQSPYPTQLFRTVHWQ 
PVOAVPEGEAARTSTMODETLNS$VQSIR\TIQAPTQVPVWSP 
RNQQ\ liHTVTLQrVPLTTVIASTDPSAGTGSQKFXIiQAl PSSQP 
MTVLKENVMLQSOKAGS PP SXVIjGPARV\QQVJCTSNVQTICNGT 
VSV\ASSPSFS\ATAPWTI.Fi:.LGSSQtiVAHPPGTVITSVIKTQ 
ETK'ZrLTQEVEfCKSSEDHLKENTEKTEQQPQPYVMWSSSNGFTS 
QVAMKQNELLEPMSF 


5403 


3445 


1563 


GECFIMAAWQQNDUVFEFASNVMEDESQLGDPAIFPAVIVEHV 
PGADI LNS YAGIACVEE PNDMITESSLDVAEEEI IDDDDDD ITL 
TVEASCHI^DETrETlEAAEAIJJWDSPGPMIiDEKRINNNlFSS 
PEDDMWAPVTHVSVTLDGIPEVMETQQVQEKYADSPGASSPEQ 
PKRKKGRKTKPPRPDSPATTPNISVKKKNKDGKGNTIYLWEFJUL 
ALLQEIKATCPKYIKWTQREKGIFKI,VDSKPVSRI.WRKHKNKP\D 
Ml^EPMGRALRYYYQRGinAKVEGORLVYQFKEMPXDLiyiNDE 
DPSSS I ESSDPSLSSSATSURNQTSRSRVSSSPGVKGGATTVUC 
PGNSKAAKPKDPVEVAQPSEVLRTVQPTQSPyPTQIiFRTVHWQ 
PVQAVPEGEAARTSTMQDETLNSSVQS I R\TIQAPTQVI»WVSP 

iwqqXlhtvtlqtvplttviastdpsagtgsqkfilqaipssqp 
mwi.kenv^ojqsqkagsppsivlx!parv\qc3viltsnvqticng^ 

VSV\ASSPSFS NATAPWTLFIjLGSSQLVAHPPGTVITSVI ktq 
ETKTLTQEVEKKESE0HLKENTEKTEQQPQPYVMVVSSSNGFTS 
QVAMKQNELLEPNS F 


5404


187 


1111 


liPVTLIFAKMKTLQSTLLLLLLVPLIKPAPPTQQDSRIIYPYGT 
ueta itBi* 1 if iQDyBuK.i liDGKNI KEKETVIIPNEKSLQIiQKDEAI 
TPLPPKKBNDEMPTCLLCVCLSGSVYCEEVDIDAVPPLPKESAY 
LYARPNKIICKLT\Aia)FADIPWIJRRrj)FTGNi:,IEDIEDGTFSKL 
SLVEEIiSIAENQIiI^KLPVLPPKItTLFWU^KYiraKSROIKANAFK 
KLNNI»TFLYI»DHNALESVPLNLPESLRVXHIiQFNNIASlTDDTP 
CKANDTSYIRDRIEEIRLEGNPIVLGKHPKSFICLKRLPIGSYF 


5405


219$ 


1220 


qnsrslhmdpqnqhgsgsslwiqqpsldsrprldyereiqpta 
ilsldqikairgsneytbgpswkrpaprtaprqekhertheii 
pinvnnnyehrhtshlghavlpsnargpilsrststgsaassgs 

NSSASSEQGUjGRSPPTRPVPGHRSERAIRTQPKQLIVDDLKGS 
LKEDLTQKKFICEQCGKCKCGECTAPRTriPSCLACNRQCU:SAE 
SMVEYGTCMCLWKG I FYHCSNDDEGDSYSDNPCSCSQSHCCSR 

ylcmgamslflpcliicyppakgclklcrrcydwihrpgcrckns 
ntvycklescpsrgqgkps 


5406 


279 


2732 


RWRTYNVBGPLTEWDVAIEPCIiEEWQdiDTAQQNIiYRNVMt^Nf








RNLVFLG/ 1 lAVSKPDLITCLEQEKEPWEPMRRHBMVAKPPVMC " 
SHFTQDFWPEQHI KDP FQKATLRRYKNCEHKNVHUCKDIiKS VpE 
CKVHRGGyNGFNQCLPATQSKIFLPDKCVKAFHKFSNSNRHKlS 
ni Js.juviji' j^i_KJivJljiU3Jb l-WijSHIjAQHAlIHTRVNFCKCEKCGK^ 
NCPS I ITKHKRI NTGEKPYTCEECGKVFNWSSRLTTHKKNYTRY 
KLYKCEECGKAFNKSS ILTTHKI I RTGEKFYKCKECAKAFNQSS 
Nr.TEHKKIHPGEKPYKCEECGKAFNtaPSTLTKHIGlIHTGEKPYT 
CEECGKAFNQFSfcrLTTHKRIHTA\EKFYKCTEOGEAPSRS\SNL 

KCGKAFNCPS I ITKHNRIKTGEKPYTCEECGKVFNWSSRLTTHK 
KNYTRYKLVKCEECGKAPNKSSILTTHECKIHIEKKFYKCEECGK 
APKWSSKLTEHKITHTGEKPYKCEECGKAFNHFSILTKHKRIHT 
GEKPYKCEECGKAFTQSSNLTTHKKIHTGEKFYKCEECGKAFXQ 
SSNIiTTHKKIHTGGKPYKCEECGKAFWQFSTLTXHKIIHTEEKp 
VKCEEOSKAFKl^SSTLTKHKIIHTGEKPYKCEECGNKAFKLSST 
LSTHKIIHTGEKPYKCEKCGKAPNRPSNIilEHKKIHTGEQPYKC 
EECGKAFNYSSHLNTHKRIHTKEQPYKCKECGKAFNQYSNIjTTH 
NKIHTGEKIjYKPEDVTVI I.TTPQTFSNI K 


5407 


3 




RPRRRQSSCCTGWr*AGWLLRAAPRFCRRTETDMEQGKGLAVI*rL 
AllLLQGTLAQSrKGNHLVKVYDYQEDGSVJbliTCDAEAKMITWF 
KDGKMIGFLTEDKKKWNLGSNAKDPRGMYQCKGSQNKSKPLQVY 
YRMCQNCIELNAATISGFLFAEIVS I FDLAVGVYFIAGTGMEFR 
QS\RASDKQTrJC>P\NDPAPTQPLKDPRKMTQy$HLQC3N\QIiRRN 


5408


2745


6128 


QGSKGTCHPQAQQPWDEGVWQEAPSQSEPWGQSQEPPXMPQRIiP 
HARQHTPIjPLGSADYRRWS VRPOGPHRDPKDSRDAAKREQGS L 
APRPVPASRGGKTIjCKGYRQAPPGPPAQFQRPICSASPPWASRF 
STPCPGGAVREE>TYPVGTQGVPSLALAQGGPaGSWRPI*EWKSMP 
RLPTDLDrGGPVJFPHYDFERSCWVRAISQEDQLATCWQAEHCGB 
VRNKDMSWPEEMSFIANSSr'rDRHKVPTEKGATGIjSmiiGNTCFM 
NSSIQCVSNTQPLTQVFISGRHLYELNRTNPIGMKGHMAKCYGD 
LVQELWSGTQKNVAPIiKLRWTIAKYAPRFNGFQQQDSQELIAFL 
LDGLHEDUJRVHEKPYVELKDSDGRPDWEVAAEAVromnJRRNRS 
IWDLFHGQIiRSQVKCKTCGHrSVRFDPFNFIiSLPLPMDSYMHL 
EITVIKIiDGTTPVRYGLRLWWDEKYTGI.KKQIjSDLCGLNSEQII, 

TQTDFSSSPSTNEMFTLTTNGDLPRPIFIPNGMPirrWPCGTEX 
NFTWGMVNGHMPSl-PDSPFTGYlIAVHRKMMRTELYPIiSSQKNR 
PSLFGMPLIVPCTVHTRKKDLYDAVWIQVSRtASPIiPPQEZVSNH 
AQDCDDSMGYQyPFTIiRVVQKDGNSCAWCPWYRFCRGCKIDCaE 
DRAFIGNAYIAVDWHPTALHIiRVQTSQERWDEHESVBQSRRAQ 
VEPINLDSCLRAFTS EEEUJENEMYYCSKCKTHCtATKMHiVm 
LPPILIIHLKRPQFVNGRW I K5QKI VKFPRESFDPSAFLVPRDP 
ALCQHKPLTPQGDELSEPRIXiAREVKKVDAQSSAGEEDVLLSKS 
PSStiSANIISSPKGSPSSSRKSGTSCPSSKKSSPNSSPRTLGRS 
KGRLRLPQXGSKNKLSSSKENLDASKEWGAGQICELADALSRGH 
VLGGSQPELVTPQDHEVALANGFLYEHEACGNGCGNGYSNGQIKS 

nhseedstddqredtrikpiyklyaischsgilggghyvtyaku 
pmckwycymdssckelhpdeidtdsayilfyeqqgidyaqpiipk 
tdgkkmadtssmdedfesdyXekycvlq 


5409 


2745 


6128 


qgskgtchpqaqqpwdegVwqeApsqsbpwgqsqepptmpqrlp 
harqhtplplgsadyrrwsvrpqgphrdpkdsrdaakreqgsl 
aprpvpasrggktlckgyrqappgppaqfqrpicsasppwasrf 
stpcpggavredtypvgtqgvpsiiariaqggpqgswrflewksmp 
rlptdldiggpwfphydferscwvraisqedqlatcwqaehcge 

VRNKDMSWPEEMSFIANSSKIDRHKVPTEKGATGbSNLGNTCFM 

NSSIQCVSNTQPLTQYFISGRHIiYELNRTNPIGMKGHMAKCyGJ) 

LVQELiWSGTQKNVAPLKLRWTIAKYAPRFNGFQQQDSQELLAPl, 

LtK3L:-tEDLNRVHEKPYVELK33SIX5RPDWEVAAEAWDNHLRI?NR 

IWDLFHGQl^SQVKCKTCGHISVRFDPFUFLSLPLPMDSYMHL . 

EIT7XKt,DGTrPVRYGLRIiNMDEKYTGriKKQI#SDIiCGLNSEQI< 

LAEVHGSNIKNFPQDNQKVRLSVSGFIiCAFEIPVPVSPISASSP








TQTDFSSSPSTNEMFTLTTNGDLPRPIFIPNGMPNWVPCGTEk 
NFTNGMVNGHM PSLPDS P FTGYI I AVHRKWMRTELYFliSSQKtW 
PSLFGMPIilVPCTVHTRKKDLVDAVWIQVSRIiASPLPPQEASNH 
AQDCDDSMGYQYPFTLRWQKDGNSCAWCPWYRFCRGCKIDCGE 
DRAFIGNAYIAVDWHPTALHLRYQTSQERWbEHESVEQSRRAQ 
VEPIWLDSCLRAFTSEEELGENEMYYCSKCKTHCIATKKIJJLWR 
LPPI L 1 IKLKRFQFVNGRW I KSQKI VKPPRES FDPSAFLVPRDP 
ALCOHKPLTPQGDELSKPRlLARBVKKVnAQSSAGEEDVUiSKS 
PSSLSAKIISSPKGSPSSSRKSGTSCPSSKNSSPNSSPRTLGRS 
KGRLRLPQIGS KNKLSSS KENLDASKENGAGQ ICEL7VDALSRGH 
VLGGSQPEIiVTPQDHEVAIiANGPLYEHEACGNGCGNGYSIKSQLG 
NHSEEDSTDDQREDTRI KP I YNLYAISCHSGILGGGHY VTYAKN 
PNCKVfYCYNDSSCKEIjHPDEIDTDSAYILPYKQQGIDYAQFLPK 
TDGKKMADTSSMDEDFESDY\EKYCVLQ 


5410


2 


710 


UIPPGQARHVWIJIARMQAPHKEHLYKLI»VIGDI*GVGKTSIIKR^ 

VHQNPSSHYRATIGVDFAiaCVJjHWDPETWRLOliWDIAaQBRFO 

HMTRVYYREAMGAPrVPDVTRPATFEAVAKWKNDIiDSKLSLPNa 

KPVSWLLANKCDQGKDVLMNNGIiKMDQFCKEHGFVGWFETSAK 

ENIKIDEASRCLVKHILANECDLMESIEPDWKPHLTSTKVASC 

SG\CAKIIiVGTFAGVW 


5411 


1302 


289 


TGPAAAGRRKAIX3S FGKPS P VTGLRAARRRRTRPSAPAAPS VGC 
GKRRBSDAGAGGERASVRTGSGRRGGRTMAGDSEQTLQNHQQ?N 
GGEEFtilGVSGGTASGKSSVCAKIVQLIiGQNEVDYRQKQWILS 
QDSFYRVI,TSEQKAKALKGQrWPDHPDAFDNEIiIIfKrLKEITBG 
KWQI5>VYDFVSHSRKEETVTVYPADVVI.FEGI tAFYSQBR/ IR 
DLFQMKLFVDTDADTRliSRRVLKDISERGRDliEQIIiSSSTLRFV 
KPAXFEEFCLPPKVKYADVI I PR\GADN\RVPINLI VQHIQ\DI 
bNGGPS \NRQTNGCLNGYTPSRKRQASESSSRPH 


5412 


3180 


313 


QGISNFFHKEANFWFEVSGYI>ISPIiRSPFVDPALEWSLMASP{?If 
KMEGESSRPEIHTPVSDKKKKKCSIHKERPQKHSHEXFRDSSLV 
NEQSOITRRKKRiCKDFQHLISSPIiKKSRICDETANATSTLKKRK 
KRRYSALEVDEBAGVTVVLVDKENINNTPKHFRKDVDVVCVDMS 
IBQKLPRK\PKTDKFQVIAKSH\AHKSEAIiHSKVREKKNKKHQR 
KAASWESQRA\RDTLPQSEFPTQBBSWLSVGPGGEITEI,P\ASA 
HKNKSKKKKKKSSNREYET\IiAMPEGSQAGREAGTDMQESQPTV 
GI^DDETPQIil^PTOKKKSKKKKKKICSNHQEFESIiAMPEGSQVGS 
EVGADMQBSVRPAVGLHGETAGIPAPAYKNKSKKKKICKSNHQEF 
EAVAMPESLESAYPEGSQVGSEVGTVBGSTALRGFKBSWSTKKK 
SKKRKIiTSVKRARVSGDDFSVPSJCNSBSTIiFDSVEGI>GAMMEEG 
VKSRPRQKKTQACIiASKHVQEAPRi;,EPANESHNVETAEDSEIRY 
LSADSGDADDSDADIiGSAVKQLQEFXPNIKDRATSTIKRMYRDD 
LERFKEFKAQGVAIKFGKFSVKENKQLEKtrVEDFIALTGXESAD 
KLLYTDRYPEBKSVirWLKRRYSFRJLHIGVRNrARPWlCLlYYBA 
KKMPDVWNYKGRYSEGDTEKLKMYHSLtiGNDWKTIGEMVARRSL 
SVALKFSQISSQRNRGAWSKSETRKLlKAVEEVlItKKMSPQEIjK 
EVDSKIiQEMPeSCIiS 1 VREKLYKGISWVEVEAKVQTRWWMQCKS 
KWrEILTKRMTWGRRIYYGMKALRAKVSLrERLYEINVEDTNBI 
DWEDLAS Al GDVP PS YVQTKFS RLKAVYVP FWQKKTFPE I ID YI* 
YETTLPLI*KEICIiEKMMEKKGTKIQTPAAPKQVFPPRDI FYYEDD 
SBGGGHRKRKRRPRRHAWFTPVIPVLWEAKAGWII 


5413 


3753 


1304 


RPPAGVAPRRAMANVS KKVS WSGRDRDDE^PLLRRTArpGGG 

TPLLNGAGPGAARQSpRSAIiFRVGHMSSVKIiDDELJjEPVDMDPP 

HPFPKE I PHNEKLIiSLKYESLDYDNSENQLFLEEERRINHTAFR 

TVEIKRWVICALlGILTGLVACFlDIWENIiAGLKYRVIKGNID 

KFTEKGGLSFSIiLLWATLNAAFVLVGSVIVAFIEPVAAGSGIPQ 

IKCFLNGVKIPHWRLKTLVIKVSGVILSWGGLAVGKEGPMIH 

SGSVIAAGXSQGRSTSLKRDFKIFEYLRRDTEKRDFVSAGAAAG 

VSAAFGAPVGGVLFSLEEGASFVINQFLTWRIFFASMISTFTLNF 

VI,S r YHGNMWDDSS PGMNFGRFDSEKMAYTIHEIPVFIAMGW 

GGVXiGAVFNALNYWLTMFRIRYIHRPCLQVI EAVIiVAAVTATVA'^' 

FVLIYSSRDCQPLQGGSMSYPIiQLFCADGEYWSMAAAFFNTPEK








SWSI>FHDPPGSYNPLTLGLFTLVYFFLAGWTVGLTVSAGVFIP~* 

SLLZGAAWGRLFGISLSYLTGAAIWADPGKYALMGAAAQLGCnv 

RMTL5LTVIMMEATSNVTYGFPIMLVLMTAKIVGDVFIEGLYDM 

HIQLQSVPFLHWEAPVTSHSLTARBVMSTPVTCriRRREKVGViv 

DVLSDTASNHNGFPWEHADDTQPAREiQGLILRSQLIVLLKHKV 

FVERSMU3I,VQRRLRLKDFRDAYPRFPP IQS IHVSQDERECIM) 

LSEFM^rPSPyTVPQEASLPRVPKl,FRALGLRHLVWDNRNQVVG 

LVTRKDLARYRLGKRGLEELSIAQT 


5414 


2130 


390 


GVASAMDRALFSPLLSPTSRVFRTSPPRCVSTETGRRDRARVPS" 

QWCSVLQGKIiPVSGRTSUVCVRSILLSPASSPRKVGIVGGTGAR 

AGAAPRDHGRVRHRRPSSARRMTRTTGQCIxAPRGCQGPRGTRSP 

RSPRSRTRRGCSASPACUP/CRSALIVAVLCyiNriLNYMDRPTV 

AGVLPDIEOFFN IGDSSSGLIQTVPI SSYMVLAPVFGYLGDRYN 

RKYIJylCGGlAFWSLVTLGSSFIPGEHPWLIiLLTRGLVGVGETVSY 

STIAPTLIADLFVADQRSRMLSIFYFAIPVGSGIiGYIAGSKVKD 

MAGDWHWALRVrPGLGWAVLI^FLWREPPRGAVERHSOLPPL 

NPTSWWADLRALARNPSFVLSSLGPTAVAFVTGSIALWAPAFrJb 

RSRWLGETPPCLPGDSCSSSDSLIFGLITCLTGVLGVGLGVEI 

SRRIiRHSWPRADPLVCATGLLGSAPFLFLSIACARGSlVATYIF 

IPIGEnil/SMNWAIVADILLYWIPTRRSTAEAFQIVtiSHLLGD 

AGSPYLIGlI^ISDRIiRRNWPPSFLSEFRALQFSLMLCAFVGALGG 
AAFLGTAHLH 


5415


693 


2986 


IPPKTKLELQKHXLTTLTXNQEQATIFBBVQKLRPRNEQRBNEI, " 
IISFLRCLPEEKQKBHIHIGEMKQTSQMAAENIGSETiPPSATRF 
RIiDMLKNKAKRSliTESLESILSRGHKARGLQEHSXSVDLDSsr,S 
STLSNTSKEPSVCEKEAIjiPISESSPifT.T.r;<:<iTPnT Qcneirctr- -nc- 

EPAPI*SPQQAFRRRANTI,SH?PIECQE?PQPARGSPGVSQRKiiM 
RYHSVSTETPHEUKDFESKANHZiGDSGGTPVKTRRHSWRQQIFL 
RVATPQKACDSSSRYEDYSELGELPPRSPLEPVCEDGPPGPPPE ^ ^ 
EICKRTSRELRELWQKAILQQH,I*LRMEKENQKLQASENDIiI.NKR 
LKIJDYEBITPCLKEVTTVWEKMLSTPORSKIKFDMEKMHSAVGQ 

gvpVrhhrgbiwkplaeqfhlkhqppskqqpkdvpykellkqlt 

SQQHAILIDIiGRTFPTHPYFSAQLGAGQLSLYNILKAYSLLDQS 

vgycqglspvagilllhmseeeafkmucflmfdmglrkqyrpdm 

IILQIQMYQLSRIJ[jHDYHRDLyNHl,BEHEIGPSI.YAAPWFLTMP 
ASQFPl.GFVARVFDMIPL<K3TEVIFKVAriSLLGSHKPLILQHEN 

letivdfikstlpklglvqmektinqvfemdiakqlqayeveyk 
vlqeei*idssplsdnqrmdklektnsslrkc»jldi,leqlqvang 
riqsleatibkli»ssesklkqamltlelersaiilqtvebttrrrs 
akpsdrepectqpeptgd 


5416 


27 


4074 


ksqlfcfwggkagdilsgdqdkeqkdpyfvetpygyqldldfi.k'" 
yvddiqkgntikrlniqkrrkpsvpcpeprttsgqqgiwtstes 
r.sssnsddnkqcpnpliarsqvtstpt skpppplets lpflti p 
enrqlpppspqlpkhnlhvtktlmetrrrleqeratmqmtpgef 

RRPRLASFGGMGTTSSLPSPVGSGNHNPAKHQLQNGyQQNGDYG 

syapaapttssmgssirhsplssgistpvtsvspmhlqhirbqm 

AIAIiKRLKELBEQVRTIPVLQVKISVLQEEKRQLVSQLKNQRT^ 

sqinvcgvrkrsysagnasqleqlsrarrsggelyidyeeeeme 

TVEQSTQRIKEFRQLXTADMQALEQKIQDSSCEASSELRENGEC 
RSVAVGAEEWMtJDIWYHRGSRSCKDAAVGTXiVEMRNCGVSVTE 
JmLGVMTEADKE lELOQQTI ESLKEKI YRLEVQLRETTHDREMT 
KLKQELQAAGSRKKVDKATMAQPLVFSKVVEAWQTRDQMVGSH 
MDLVDTCVGTSVETNSVGISCQPECKNKWGPBLPMNBWIVKER 
VEMHDRCAGRSVEMCDKSVSVBVSVCETGSNTEESVNDLTIiIiKT 
NLtOiKEVRSIGCGIXrSVDVTVCSPKECASRGVNTEAVSQVEAAV 
MAVPRTADQDTSTDLEQVHQPTKTETATLIESCTNTCLS'TLDKQ 
TSTQTVETRTVAVGEGRVKDIKSSTKTRSIGVGTXiLSGHSGFDR 
PSAVKTKESGVGQININDNYLVGLKMRTIACGPPQLTVGLTASR 
RSVGVGDDPVGESLENPQPQAPLGMMTGLDHYIERIQKLbAEQQ 
TLLAENYSEIAEAFGEPHSCX^GSLUSQLlSTIiSSINSVMKSASt 
EEIiRWPDFQKTSt;3KITGSYIiGYTCK0GGLQSGSPI.SSQTSQPE








QEVGTSEGKPISSLDAFPTQEGTLSPVWLTODQIAAGLYACTNN 
ESTLKSIMKKKDGNKDSNGAKKNLQFVGINGGYETTSSDDS^SD 
ESSSSESDDEGDVIEYPLEBEEEEEDBDTRGMAEGHHAVNIEGIj 
KSARVEDEMQVQECEPEKVBIRERYELSEKMDSACNLLKNTIND 

pkaltskdmrfclntlqhewfrvssqksaipamvgdyiaafeai 

SPDVLRYVlNLADGNGNTALHYSVSHSNFEIViCLLI*DADVCNVD 
HQNKftGYTPIMLAAXiAAVEAEKDMRIVEELFGOGDVNAKASQAG 
QTAI/flOWVVSHGRIDMVKGLrACGAOVWlQDDEGSlAl.MCASEHG 
HVEIVKLLLAQPGCNGHLEDNDGSTALSIALEAGHKDIAVIjLYA 
HVNFAKAQS PGTPRLGRKTS PGPTHRGS FD 


5417 


27


4074 


KSQIiPCFWGG:<AGDILSGDQDKEQKDPyFVETPYGYQLDIiDFX*K~" 

YVDDIQKGNTIKRLNIQKRRKPSVPCPEPRTTSGQQGIWTSTES 

LSSSNSDDNKQCPNFLilARSQVTSTPISKPPPPLETSLPFLTIP 

ENROLPPPSPQr.PKHNLHVTKTLMETRRRLEQERATMQMTPGBF 

RRPRLASFGGMGTTSSLPSFVGSGNHNPAKHOLQWGYQGNGDYG 

SYAPAAPTTSSMGSSIRHSPLSSGrSTPVTNVSPMHIiQKIREQM 

AIAIiKRLKEI*EEQVRTIPVLQVKISVI.<>EEKRQriVSQLKNQRAA 

SQINVCGVRKRSYSAGNASQLEQIiSRARRSGGELYlDYEEEEME 

TVEQSTQRl KEFRQr,\TA3MQAI»EQKIQDSSCEASSEI*RENaEC 

RSVAVGAEENI^INDIWYHRGSRSCKDAAVGTLVEMRNCGVSVTK 

AMLGV^rrEADKEIELQQQTIESLKEKIYRLEVQUiETTHDREMT 

KLKQELOAAGSRKKVDKATMAQPLVFSKVVEAVVQTRDQMVGSH 

MDIiVDTCVGTSVETNSVGISCCPECKNKVVGPELPflNrWWXVKER 

VEMHDRCAGRSVEMCDKSVSVEVSVCETGSNTBESVKDIjTLLKT 

NLNLKEVRS IGCGDCSVDVTVCS PKECASRGVKTEAVSQVEAAV 

MAVPRTADQDTSTOLEQVHQFTWTETATLIBSCTNTCXSTIiDKQ 

TSTQT VETRTVAVGEGRVKD INSSTKTRS IGVGXCiLSGHSGFDR 

PSAVKTKESGVGQININDNYLVGIiKmTlACGPPQI»TVGLTASR 

RSVGVGDDPVGESIjBNPQPQAPLGMMTGLDHyiERlQKI,I*ABQQ ^ 

TLLAENYSELAEAFGEPHSQMGSLMSQLISTLSSINSVMKSAST 

EELRJJPDFQKTSUSKITGSYLGYTCKCGGLQSGSPLSSQTSQPE 

QBVGTSEGKPXSSLDAFPrQEGTIiSPVNLTDDQIAAGIiYACTNN 

ESTLKSIMKKKDGNKDSNGAKKNLQPVGINGGYETTSSDDSSSD 

ESSSSESDDECDVIEYPIiEEEEEEEDEDTRGMAEGHHAVNIEGL 

KSARVBDBMQVQBCEPEK^^IRERYELSEK^!Il5ACNt*LKI^TIND 

PKAIiTSKDMRFCLNTLQHEWFRVSSQKSAIPAMVGDYIAAFEAI 

SPDVLRYVINLArH3NGNTAliHYSVSHSOTEIVKl>I.UD^U)VCNrVD 

HQNKAGYTPIM1AAIAAVEABKDMRIVEELFGCX5DVNAKASQAG 

QTALMLAVSHGRIDMVKGLIACGADV^aQDDEGSTAIWCASEH^ 

HVBIVKLIJ^OPGCnSIGmjEDNDGSTALSIALBAGHKDIAVI^LYA 

HVNFAKAQSPGTPRLGRKTSPGPTHRGSFD 


5418 


24 


1133 


SVPRAGGDMETGAAELYDQALLGILQHVGNVQDFLRVI.FGFLYR 
KXDFYRLIiRHPSDRMGFPPGAAQALVLQVFKTFDHMARQDDEKR 
RQKr.EKKIRRKEEEEAKTVSAAAAEKEPVPVPVQErEIDSTTBL 
DGHQEVEICVQPPGPVKEMAHGSQEAEAPGAVAGAAEVPR\EP?I 
LPRIQEQPQKNPDSYNGAVRENYTWSQDYTDLEVRVPVPKHWK 
GKQVSVALSSSSIRVAMI.EEKGERVLM3GKLTHKINTESSLMSL 
EPGKCVLVNLSKVGEYWWNAItiEGEEPIDIDKlNKERSMATVDE 
EEQAVLDRIiTPDYHQKL^KPQSHELKVHEMIiKKGWDAEGSPFR 
GQRFDPAMEWISPGAVQF 


5419 


1395 


259 


GtHPLDPDLVSRTS VQGP IiMTMACPGMSDt EESPFIjGPRAAEEG 
SESEACEAFGRRKSEEEGRRSDTSGFGRSRKHKVNWKHPERADA 
KDPASLPQC/LGP/IX!VRPAQPSSKYCSDIXX3MKI*AANRIYEIL 
PQRIQQWQQSPCIAEEHGKKLbERIRREQOSARTRIiQEMERRFH 
EriEAIILRAKQQAVREDEESNEGDSDDTDLQIFCVSCGHPINPR 
VAIiRHMERCYAKYESQTS FGSMYPTRIEGATRLFCDVYNPQSKT 
YCKRLQVLCPEHSRDPKVPADBVCX5CPLVRDVFELTGDFCRLPK 
ROCNRHYCWEKIJEUlAEVDLERVRVwyJCLDELFEQERNVRTAMTN 
RAGLLALMLHQTIQHDPLTTDLRSSADR 


5420 


117 


1733 


NEAGGAGP FKGGASGRL YXjSPRLPRVS VAGCEERPl»GWVWVl>a|^ 
GGFLPARPPRAQRHUSFSHAJlQSMEAPDYEVIiSVREQIiFHERIR








ECIISTLLFA?I.YILCHlFU'UFKKPAEFTT\GMMKrt^PSTRL/ 
LLELCTFTLAIALGAVLLI.PFSHSN3VI,LSLPRNYYIQWLltGS 
LIHGLWNLVFLPSNI/SLIFLMPFAYFFTESEGFAGSRKGVLGRV 
YETVVMl^LLTlJ^VlKSMVWVASAIVDKNKANHESLyDFWEYiTjP 
yLYSCISFLGVLriLLVCTPLGUVRMFSVTGKLLVKPRIiLEDLEE 
QLYCSAFEEAALTRRICNPTSCWLPLDMELLHRQVliAI^JTQRVL 
LEKRRKASAWQRNLGYPIAMLCLLVLTGLSVLIVAIHILELLID 
EAAMPRGI^TSLGQVSFSXLGSFGAVIQVVLlFyLMVSSVVGF 
YSSPLFRSLRPRWHDTAMTQIlGNGVCUt.VLSSALPVFSRTLGL 
TRFDLLGDPGRPNWLGNFYrVPLYNAAPAGLTTIiCLVKTFTAAV 
RABLXRAFGERE 


5421 


117 


1733 


NEAGGACPFKGGASGRLYLSPRLPRVS'VTiGCEERPtiGVmJVLGG 
GGFLPARPPRAQRHtGFSHAEQSMEAPDYEVLSVREQLFHERIR 
ECIlSTLbFATLYILCHIFLTRFKKPAEFTT\GMMKMPPSTRL/ 
LLELCTFTLAIALGAVLLLPFS IISNEVliLSLPRNYVlQWIJTGS 
LIHGLWNLVFLFSNLSLIPLMPFAYPFTESBGFAGSRKGVLGRV 
YETVVKLiyOjLTLliVl^MVWVASAIVDKinCANRESLYDFWEYYL^ 
YLYSCISFLGVLLLWCTPliGLARMFSVTGKLLVKPRIiliEDLEE 
QLYCSAFEEaVAIiTRRXCNPTSCWTiPLDMELLHRQVLALQTQRVL 
IiEKRRXASAWQRIfriGYPItAWI,CX;IiVLTGI*SVLlVAIHItiEI*I,ID 
EAAMPRGMQGTSLGQVSFSKliGSFGAVIQWLIPYLMVSSWGP 
YSSPLFRSLRPRWHDTAMTQIIGNCVCLLVLSSALPVFSRTLGL 
TRFDLLGDFGRFNWLGNFYIVFLYNAAPAGLTTLCLVKTFTAAV 
RAELIRAFGERE 


5422 


3 


1263 


SCGESLPTWlAGASRPaiGRKGGAWGGR(3GSSPAQVIiLSPGPVF 

KAGCNWWHLSRDQAGVQRCDIiGSSQPPPLGFKRFSCIjSLPSSWD 

YRSTVLCVSKMEABLSGFKXDAPRWDQRTFJjGRVKHFLNITDPR 

TVFVSEREXJDKAKVMVEKSRMGWPPGTQVEQI.T.YAKKJbYDSAF 

HPDTGEKMNVIGRMSFQIiPGGMIITGFMLQFYRTMPAVIFWQWV ? , 

NQSPNALVNYTNRNAASPTSVRQMALSYFTATTTAVATAVGMNM 

IiTiCKAPPLVGRMVPPAAVAAARCVNI PMMRQQBI,XKGICVKDRN 

BNEIGHSRRAAAIGlTQWISRITMSAPGMILIiPVrMERLEKLK 

FMQKVKVL/SAPLQVMI»SGCFLIFMVPVACGLFPQKCEI.PVSYI:» 

EPKLQDTIKAKYOELEPYVYFNKGL 


5423 


3186 


905 


GVSMALGEEKAEAEASEDTKAQSYGRGSCREREUJIPGFMSGEQ 
PPRLEAEGGLISPVWGAEGIPAPTCWIGTDPGGPSRAHQPQASD 
ANREPVAERSEPALSGLPPATMGSGDLLLSGESQVEKTiaaSSSE 
EFPQTLSLPRTTICSGHDADTEDDPSI»;U>LPOALDLSQQPHSSG 
LSCLSQWKSVLSPGSAAQPSSCSISASSTGSSLQGHQERAEPRG 
GSliAKVSSSIiEPWPQBPSSWGriGPRPOWSPQPVFSGGnASGIt 
GRRRLSFQAEYWACVLPDSI»PPSPI3RHSPXiWNPNKEYEDriI.DYT 
YPLRPGPQIjPKKLDSRVPADPVLODSGVDLDSFSVSPASTLKSP 
TNVSPNCPPAEATALPFSGPREPSLKGfWPSRVPQKQGGMGLASW 
SQr*ASTPRAPGSRnARWERREPALRGAKDRLTIGKHLDMGSPQL 
RTRDRGWPSPRPEREKRTSQSARRPTCTESRWKSEEEVESDDEY 
lALPARLTQVSSLVSYLGSISTLVTLPTGDlKGQSPIiEVSDSDG 
PASFPSSSSQSQLPPGAALQGSGDPEGQNPCFLRSFVRAHDSAG 
Eivso tA»o ou/vLivj vi> b^ltij icrRPS IjPARIiDRWPFSDPDVEGQL P RK 
GGEQGKESLVQC\VKTFC\CQLEErirCWLyNV\ADVTDHGTPAR 
SNLTSLK\SSIXJLYRQFKKDIDEHQSIiTESVLQKGEILLQCr*LE 
NTPVLEDVI/SRIAKQSGELESHADRLYDSIlAASLDMIJ^CrrLIP 
DKKPMAAMEHPCEGV 


5424 


3186 


905 


GVSMALGEEKAEAEASEDTKAQS YGRGSCREREIiDIPGPMSGBQ * " 

PPRliEAEGGLlSPVWGAEGlPAPTCWIGTDPGGPSRAHOPQASD 

ANREPVAERSEPAr,SGLPPATMGSGDIiIiI*SGESQVEKTKIiSSSE 

EFPQTDSLPRTTICSGHDADTEDDPSIADLPQALDLSQQPHSSG 

LSCOSQWKSVI.SPGSAAQPSSCSISASSTGSStiQGHQERAEPRG 

GSIAKVSSSLEPWpQEPSSWGLGPRPQWSPQPVPSGGDASGL 

GRRRLSFQAEYWACVLPDSLPPSPDRHSPLWNPNKEYEDLLDYT . 

YPLRPGPQLPKHLDSRVPADPVLQDSOVDLDSFSVSPASTtiKSf 

rNVSPNCPPAEATALPFSGPREPSLKQWPSRVPQKQGGMGIASW








SQU^TPRAPGSRDARWERREPALRGAKDRLTIGKHLDMGSPQL 
RTRDRGWPSPRPBRBKRTSQSARRPTCTESRWKSEEEVESDiSfeY 
LAbPARLTQVSSLVSYIiGSISTliVTLPTGDIKGQSPLBVSDSDG 

pasppssssqsqlppgaabqgsodpegqnpcfiirsfvrahdsag 
egslgssqalgvssgllktrpslparldrwpfsdpdvegqlprk 
ggeqgkeslvqcWktfcVcqleelicwlynvXadvtdhgtpar 
snltslk\sslqliyrqpk:<dldehqsltesvlqkgeili,qciile 
htpvledvlgriakosgeiieshadrlyts iiasldmlagctlip 
dkkpkaambhpcegv 


5425 


1086 


115


GFCPSPSI/3HQPPRV1.HPTMSMAVETFGFFMATVGLLMI*GVTLP~ 
NSY WRVSTVHGNVITTNTI FENLWFSCATDS LGVYWCWEFPSMIj 
AliSGYIQACRALMITAILLGFLGLLLGIAGLRCrmiGGIiELSRK 
AKLAATAGAPH\ IIiPGICGMVAI \SWYAFNITR\DPSDPLYPGT 
KYELGPALYLGWSASLl S I LGGIiCLCSACCCGSDEDPAASARRP 
YQAPVSVMPVATSDQEGDSSFGKYGRNALRVAAtiCRGPRCLPTA 
PKKRGPGRGPFPYSNLRGRPRPVPVAPPRPRPRVLHSHGPSQAK 
NCSWEVAYLPSEAGSLIF 


5426 


42 


3435 


ATSSQSIiGRADPPRGGTMERSPGEGPSPSPMDQPSAPSDPTDQP 

PAAHAKPDPGSGGQPAGPGAAGEAIiAVLTSFGRR£,UVLI PVYLA 

GAVGLSVGFVIiFGLAUYIiGWRRVRDEKEKSLRAARQLLDDEEQL 

TAKTLYMSKRELPAWVSFPDVEKAEWI*NKIVAQVWPFLGQYMEK 

LLAETVAPAVRG SNPHLQTFTFTRVELGEKPLRI IGVKVHPGQR 

KEQILLDLNISYVGDVQIDVEVKKYPCKAGVKGMQUlGVLRVIIi 

EPLIGDIjPFVGAVSMFFI RRPTLD INim3MTNriIjDIPGI.SSX,SD 

TMIMDS lAAFLVLPNRLLVPI^VPDLQDVAQLRSPlipRGI IRIHL 

LAARGIiSSKDKYVKGrjIEGKSDPYALVRLGTQTFCSRVaDEELN 

PQWOETYEVMVHEVPGQElEVEVFDKDPDKDDFIiGRMKIiDVGKV 

IjOASVliDDWFPLQGGQGQVM£iRDBWJiSL7jSDAEKI*EQVLQWNWG 

VS S RPDP PSAAIli WYLDRAQDLPMVTSELYP PQLKKGNKE PNP ' 

MVQLS IQDVTQES KA^^ ' <5 1 NCPVWEEAFRFPLQDPQSQELDVQV 

KDDSRALT]:<5ALTLPLARLLTApEIiILDQWPQi:iSS3GPNSRLyM 

KLVNR ILYLDSSE ICFPTVPGCPGAWPVDSENPQRGSSVDAPPR 

PCHTTP0SQPGTEHVLRIHVI,EAQDLIAKDRFLGGLVKGKSDPY 

VKLKLAGRSFRSHWREDItWPRWNEVFEVIVTSVPGQEriEVEVF 

DKDLDKDDFIiGRCKVTU.TTV3:*NSGFXiDEWIiTLKDVPSGRIiHLRt* 

ERLTPRPTAAELEEVLQVNSLIQTQKSAEUUVALLSIYMERAED 

LPLRKGTKHl^PYATLTVGDSSHKTK'TISQTSAPVWDESASPI*! 

RKPKTESI*ELQVRGEGTGVliGSLSLPLSEI»LVADQIjCtjt)RWFTI« 

SSGQGQVULRAQLGILVSQHSGVEAHSHSYSHSSSSr^SEEPEtjS 

GGPPHI TSS APEV\RQRLTHVDSPI»EAPAGPliGQVKLXLW YYSE 

ERKLVSIVHGCRSI^QNGRDPPDPYVSIiLLLPDKMRGTKRRTSQ 

KKRTLS PEFNERFEWEIiPLDEAQRRKLDVS VKSNSS FMSREREli 

LGKVQLDIAETDLSQGVARWYDLMDNKDKGSS 


5427 


42 


3435 


AXSSQSLGRADPPRGGTMERSPGEGPSPfiPMDQP^APSriPTDQP 
PAAHAKPDPGSGGQPAGPGAAGEAIiAVLTSFGRRIiLVLIPVYIA 
GAVGXiSVGFVIiFGLALYLGVJRRVRDEKERSIiRAARQriLDDEEQIi 
TAKTIiYMSHRELPAWVSFPDVEKAEWLKFKIVAQVWPPIiGQYMEX 
LIJ^TVAPAVRGSNPHLQTFTFTRVEIiGEKPLRIIGVKVHPGQR 
KEQILLDLNlSYVGDVQIDVEVKKYFCKAGVKGMQLHGVLRVXIi 
EPOIGDLPFVGAVSMFFIRRPTLDINWTGMTNLLDIPGLSSIiSD 
TMlMDSlAAFt,VLPNRLLVPLVPDr*QDVAQLRSPI*PRGlIRIHL 
LAARGIiSSKDKYVKGLIEGKSDPYAlrVRLGTQTPCSRVIDEELM 
PQWGETYEVMVHEVPQQEIEVEVFDKDPDKDDFLGRMKLDVGKV 
IiQASVI*DDWFPLQGGQGQVHLRLEWLSLLSDAEKLBQVLQMNV7G 
VSSRPDPPSAAILWYLDRAQDI,PMVTSEI.yPPOr.KKGNKEPNP 
MVQLS IQDVTQES KAVYSTNCPVWEEAFRFFLQDPQSQELDVQV 
KDDSRAI,TIiGAIiTI*PIJ^I.TAPELII«DQWFQLSSSGPNSRl.YM 
KIiVMRILYLDSSEICPPTVPGCPGAWDVDSENPQRGSSVDAPPR 
PCHTTPDSQFGTEHVI.RIHVLEAQDLIAKDRPIJGGLVI03KSDPY 
VKLKLAGRS PRSHWREDUTPRWNEVFEVIVTS VPGOEIiEVEVf 
DKDr4DKDDFliGRCKVRLTTVLtISGFrJDEWLTLEDVPSGRX.HIiRL 






WO 01/53312



PCT/US00/34263





5428 






ERLTPRPTAAELKKVLOVNSLlgiUK^AElJVAALLSIVMEEiAEir- 

LPLRKGTKHLSPYATLTVGDSSHKTfCTISQTSAPVWDESAS^I 

RKPHTESLELQVRGEGTGVLGSLSLPLSELLVADQLCLDRWFTL 

SSGQGQVLLRAQLGILVSQHSGVEAHSHSYSHSSSSLSEEPELS 

GGPPHITSSAPEWRQRLTHVDSPLEAPAGPLGQVKLTLWyySE 

ERKLVSIVHGCRSLRQNGRDPPDPYVSLLLLPDKNRGTKRRTSQ 

KKRTLSPEPNERFEWELPUDEAQRRKLDVSVKSNSSFMSREREL 

LGKVQLDLAETDLSOGVARWYDIiMDNKDKGSS 


5429 


1839


" "J i-iuAj^rt W4 J. /ur' i^'vixtvi^iyH i/AK PAyLyKPtiXMVEDQAEEIiED 

LVHFSVSELPSRGVGVMEEIRRQGKLCDVTLKIGDHKFSAHRIV 

lAASiPYFHAMFTNBMMECKQDEIVMQGMDPSALBALlNFAYNG 

NIAIDQQNVQSI,U4GASFZ.QLQSIKDACCTFLRERLHPKNCLGV 

RQFAETMMCAVLYDAANSFIHQHFVEVSMSEEFIALPI*EDVLEL 

VSRDEEiM\/JC^P*T7AX/t?t?liXT Rtinm\rrMiEij^r\j^mr«r i 

v^tujauvt viiL;>j:,iiyvFbAAl*AWWYDREQRGTPL\RNIjQSNlRLL 

FCRPQFLSDRVQQDDLVRCOHKCRDLVDEAKDYLLMPERRPHLP 

AFRTRPRCCTSIAGLIYAVGGLNSAGDSLNWEVPDPIANCWER 

CRPMTT7VRSRVGVAWNGLLYAIGGYDGQLRLSTVQAYNTETDT 

WTRVGSMNSKRSAMGTWIiDGQI YVCGGYBGNSS I,SSVETYSPE 

TDKWTWTSMSSNRSAA\GVTVFEGRIYVSGGHDGLQIFSSVEH 

YNHHTAXraPAAGMLNKRCRHGAASLGSKMPVCGGYDGSGPLSI 

AEMySSV\ADQWCLIVPM\HTRR\SRVSXX]fGPAVGRLYAVWGVT 

TGQSNL\SSVGDVLTPETDC^TFM\APMACHEGGVGVGCIPI,LT 


5430 


828 202 


RREDALSSEGGIiWPSESTVSGNGIPEPQVYAPPRPTDRIAVPPF 
AQRERFHRFQPrYPyLQHEIDLPPTISItSDGEBPPPYQGPCTSQ 
LRDPEQQLELNRESVRAPPNRT1FDSDZ,MDSAR1X3GPCPPSSNS 
GISATCYGSGGRMEGPPPVXYSEVIGHYPGSSFQHQQSSGPPSr* 
liEGTRbHHTHlAPLESAAIWSKEKDKQKGHPL 


5431 


441 1507 


QKRRKRRRKKIMKTIQPKHHNSISWAIFYGIAALCLFQGVPVRS"^ 
GDATFPKAMDNVTVRQGESATIJiCTIDNRVTRVAWIiNJ^ 
GNDKWCLDPRWLLSOTQTQYSIEIQNrVDVYDEGPYTCSVQTDM 
HPKTSRVHLIVQVSPKIVEISSDISINEGNNISLTCIATGRPEP 
TVTWRHISPKAVGPV'SEDEYljElOGTTRFn'^nYPr'Q ivcMT\if \ t» 

APV\VRRVKVTVNYPPYISEAKGTX5VPVGQKGTLQCEASAVPSA 
EFQWYKDDKR1.I/EGKKGVICVENRPFLSKI.IFFNVSEHDYGNYT 
C^NKLGHTNAS IMLPGPGAVSEVSNGTSRRAGCVWtXPLLVL 


5432 


2 1312 


AAAAPGSRRRRPLPDRPHMAHGYEAPPPPAPRSPAWRARSKPVT" 

IiPGITlNP\TIAEGPSP\TSEGASEAl!aiVDLOKKr.EEI^U)EQQ 

KKRLEAFLTQKAKVGELKDDDFERISELGAGNGGWTKVQHRPS 

GLlMARKLlHtiEIKPAlRNQIIRELOVI^HBCNSPYIVGFYGAPY 

SDGElSICMEHKDGGSLDQVLKEAKRIPEEILGKVSIAVIiRGLA 

YLREKHQIMHRDVKPSWILVNSRGEIKLCDFGVSGQUDSMANS 

FVGTRSyMAPERiQGTHYSVQSDIWSKGLSLVELAVQRYPXPPP 

DAKEUSAIFGRPWDGEEGEPHSISPRPRPPGRPVSGHGMDSRP 

AMAIFELI.DYIVNEPPPKLPNGVFTPDFQEFVNKCI1IKNPABRA 

*j^iu ujxxYX) i r vjcvn vi/FAGWIiCKTIiRIjNQPGTPTRTAV 


5433 




1 1312 
[ 

1 I 

i 
I 


AAAAPGSRRRHPUPDRPHMAHGYEAPPPPAPRSPAWRARSiCPVV" 

LPGI'riNP\TlAEGPSP\TSBGASEANI.VDLQKICLBELEU5EQO 

KKRLEAFLTQKAKVGELKDDDFERrSET^AGMGGVVTKVQHRPS 

3LIMARKLIHLn:iKPAlRHQIIREIiQVI,HECHSPYIVGFYGAFY 

3DGEISICMEHMDGGSLBQVLKEAKRIPEEILGKVSIAVI.RG1A 

SfI.RBKHQlMHRDVKPSNTLVNSRGEIKLCDFGVSGQIiIDSMANS 

FVGTRSYHAPERLQGTHYSVQSDIWSMGLSLVELAVGRYPIPPP 

3AKELEAI FGRPWDGEEGEPHS ISPRPRPPGRPVSGHGMDSRP 

^lAIPELLDYrVNEPPPKLPNGVFTPDFQEFVNKCLIKNPAERA 

5I*KMLTNHTFi:<RSEVEEVDFAGWZ.CKTI,RI/tfQPGTPTRTAV 




360 


1885


>VQED£CVGFEDPLHLCSWRARACPCTWPHCyCTGLLECLGFAGV 
iFGWPSLVFVFKNEDYFKDLOGPDAGPIGNATGQADCKAQDERB 
;LIFTLGSFMN^^F^3TFPTGYIFDRFKTTVARIlIAIFFYTTATri? 
XAFTSAGSAVbLFlAMPMLXiGGILFLITWLUKSNIiFGQHRSTI 
ITLyNGAFDSSSAVFLrrKLLYEKGISLR/VI,LHtaHLCLQYSvC 
SrHFPPDAPGAHPlPTAPQLQLWPVPWEWHHKGREG/QQI^SMKT 
GSYSQRSSFQRRKKPQGQGRSRNSAPSGATL/CSRRPAWHLVWL 
SVIQLWHYLFIGTUtlSLLTNMAGGDMARVSTrrNAFAFTQFGVL 
CAPWWGU^RLKQKYQKESOilCTGSSTIAVALCSTVPSlALrSI, 
LCI/3FALCASVPILPLQYLTF1LQVISRSFLYGSNAAFLTIAFP 
SEHFGKLPGLVMALSAWSLLQFPIFTLIKGSLQNDPFYVNVMF 
NLAIU/TFFHPFLVYRECRTWKESPSAIA 



5435



4 704 



1597 



KIAAliIISLlQHKLLWRNQHCSRCVIMSPAySAGUJWLP/GSGK 
HGPFUSCSQYPACDYVRPLKSSADGMIVKVLKGQVCPACGAMLV 
LRQGRFGMFIGCINYPECEHTBLIDKPDETAITCPQCRTGHLVQ 
RRSRYGKTFHSCDRYPECQFAINFKPIAGBCPECHYPLLIEKKT 
AQGVKHFCASKQC5GKPVSAE 



5436 



5437 



1781 



739 



1672 



2443 



1152 



PGusauRiiAEMSNAKERKHAKKMRNQPTNVTJUSSGPVADRGVKH 

HSGGEKPFQAQKQEPHPGTSRQRQTRVNPHSLPDPEVNEQS2SK 

GMFRKK<3GWKAGPEGTSQEIPKyiTASTFAQARAAEISAMI*KAV 

TQKSSNSLVFQTI.PRHMRRRAMSHNVKRLPRRLQEIAOKEAEKA 

VHQKKEHSKWKCHKARRCHMNRTI.EFNRRQKKMIWLETHIW1IAK 

RF«^n/^CKh'GYCLGERPTVK5HRACYRAMTNRCr.IK2DLSYYCCr,E 

LKGKEEEILKALSGMCNIDTGLTFAAVHCLSGKRQGSLVLVRVM 

KYPREMI^PVTFIWKSQRTPGDPSESRQLWIWI»HPTI,KQDI1.EE 

IKAACQCVEPIKSAVCIADPLPTPSQEKSQTELPDBKIGKKRKR 

KDDGEJJAKPIKKIIGDGTRDPCI^PYSWISPTTGIIISDLTMEMM 

RFRLIGPLSHSILTEAIKAASVHTVGEDTEETPHRWWIETCKKP 

DSVSX^CRQEAIFELLGGirSPAEIPAGriLGLTVGDPRINLPQ 

KKSKALPWPEKC«DNBKVRQIJ[J.EGVPVECTHSFIM5IQDICKSV 

TENKISDQDLNRMRSELLVPGSQLILGPHESKIPILLIQQPGKV 

TGEDRLGWGSGWDVLLPKOWGMAFWIFFIVRGVRVGGtiKESAVH 

SQYKRSPNVPGDFPDCPAGMLPAEEQAKNI,LEKYKRRPPAKRPN 

YVKLGTIiAPFdCPWEQLTQDWESRVQAYEEPSVASSPNGKESDL 

RRSEVPCAPMPIOCTHQPSDEVGTSIEHPREAEEVMnAGCQBSAG 

PERITDQEASEWHVAATGSHLCVLRSRKLLKQLSAWCGPSSEDS 

RGGRRAPGRGQQGLTREACLS IliGEFPRALVWVSLSLLSKGSPE 

PHTMICVPAKEDPLQLHBDWHYCGPQESKHSDPFRSKIUCQKEK 

kkrekrqkpVgrassdgpageepvagqealtlglwsgplprvtii 

HCSRTLLGFVTQGDFSMAVGCGEAW3FVSLIGLLDMLSSQPAAQ 
RGLVLLRPPASIiQYRFARIAIEV 



ASDSIPWSEARTTRKTJ^QRGCQWSLPERMPLVVFCGLPYSGKSR- 
RAEELRVATAAEGRAVYWDDAAVLOAEDPAVYGDSAREKALRG 
AlJlASVERRI.SRHDWILDSLKYIKGPRYELY\crARAARTPLC 

lvycvrpggpiagpqvaganenpgrnvsvswrpraeedgraqaa 

GSSVLRELHTADSWNGSAQAt>VPKELEREBSGAAESPAI,VTPD 
SEKSAKHGSGAFySPELLEALTLRF£;APDSRNRWDRPLFTLVGL 
EEPIiPLAGIRSALFENRAPPPHQSTQSQPliASGSFiaQLDQVTS 
QVLAGLMEAQK^AVB3DLLTJUPGTTBHIJJFTRPLTMAEI.SRLRR 
QFISYTKMHPNMEWIiPQLANMFLQYLg QSUI 

UQEAASEFGGpi^TPAMFlMlXSGHLPRPWGRlOr PMRPDPPYPE "" 

PRRVDSSSENSGSDWDSAPETMEDVGHPKTKDSGATiRVSRAASE 

PSKEEPQVEQLGSKRMDSLKWDQPISSTQESGRI^GGASPKLR 

WOHVDSGGTRRPGVSPEGGLXGVPGPGAPLEKPGRREKLLGWLR 

GEPGAPSRYI/SGPEECLQISTNLTLHLLELLASAIalALCSRPLR 

AAU3TLGLRGPLGLWLHGLLSFIAALKGLHAVLSLLTAHPLHFA 

CLPGLLQALVLAVSLREPNGDEAATDWHSEGLEREGEEORGDPG 
KGL 



TKPRKRRHQPA50RQRPWSSbSTGDLIiARGK GKKi:;BNK^ 

MPPSLRRPMMCQSEARQGPEljy^AKWLHFPQLAriRRRl^t^C 

MSRPALKLRSWPLTVLYYLLPFGALRPLSRVGWRPVSRVALYKS 

VPTRLLSRAWGRUMQVELPHWLRRPVYSLYIWTFGVNtyiKEAAVE 

DlJHHYRNbSEFFRRKLKPQARPVaSLHSVISPSDGRII^WPGQVli' 

NCEVEQVKGVTYSLESFLGPRMCTEDLPFPPAASCDSFKKQLVT 








REGNELYHCVIYLAPCDYHCFHSP'rDWrVSHRRHPPGSLMSVNP 
GMARWIKEIiFCHNBRWL'rGDWKHGPFSLTAVOATVNWGSIRiy 
FDRDiaTNSPRHSKGSYimFSFVTHTNREGVPMALRGEHLG /QS 
tNIjGSTIVLI FEAPKDFNFQLKTGQKIRFGEAIiGSIi 


5439


2443 


1152 


TKPRKRRHQPASQRQRPWSSDSTGDI.LARGKGRKEENKGSDRVS'""' 

LAPPSLRRPMMCQSEARQGPELRAAKt^LHPPQLAl^RRLGQLSC 

MSRPALKLRSMPLTVLYYLl,PFGAtiRPLSRVGWRPVSRVAIiyKS 

VPTRLLSRAWGRLNQVELPHWiiRRPVYSLYIWTFGVNMKBAAVE 

DIiHHYRNLSEFPRRKLKPQARPVCGLHSVISPSDGRILNFGQVK 

NCEVfiQVKGVTYSUBSPLGPRMCTEDLPFPPAASCDSFKNQLVT 

REGNELyaCVIYLAPGDyHCFHSPTDWTVSHRRHPPGSLMSVNP 

GMARWIKELFCHNSRWLTGDWKHGFPSLTAVGATVNWGSIRIY 

PDRDLHTNSPRHSKGSYNDFSFVTHTNREGVPMAJuRGEHLG/QS 

PKLGSTIVLIFEAPKDFNFQLKTGQKIRFGEALaSL 


5440 


693 


253 


EPIPVTPDHRIjVTHTHIV\QTFSPVN5\G0PPNYEMIiKEEQEVA 
MI<5APHNPAPPMSTVXHIRSETSVPDHVVWSriFNTLFMNTCCLG 

FlAFAYSVKSRI>RKMVGDVTGAQAyASTAKCLNIWAL.ILGIFMT 
ILLIIIPVLVVQAQR 


5441 


2 


2054


CRDGGKNGFMVSPMKPLEIKTQCSGi'RMDPKICPXbPAFFSPIN 

NSDLWVANIETGEERRLTFCHQGLSNVLDDPKSAGVATPVIQEE 

PDRFTGYWWCPTASWEGSEGLKTtiRILYEEVDESEVEVIHVPSP 

ALEERKTDSYRYPRTQSKNPKIAUCIiAEFQTDSQGKIVSTOEKE 

LVQPPSSLFPKVEYIARAGWTRDGKYAWAMFIiDRPQQWLOLVLli 

PPALPIPSTENBEQ\RIASARAVPRNVQPYWYEEVTNVWINVH 

DIFYPFPQSEGEDELCFLRANECKTGFCHLyKVTAVLKSQGyDW 

SEPFSPGEGEQSI,TNAIWVNEETKLVYFQGTKDTPl.EHHLyWS 

YBAAGEIVRLTTPGFSHSCSMSQNFDMFVSHYSSVSTPPCVHVY 

KLSGPDDDPLHKQPRPWASMMEAAKIFHPHTRSDVRLYGMIYKP 

HALQPGKKHPTVLFVyGGPQVQLVNNSFKGIfCYIjRIiNTIAStXSY ^ 

AVWIDGRGSCQRGLRFEGAI^KNQMGQVEIEDQVEGLQFVAEKy 

GFIDLSRVAlHGWSYGGPLSLMGIilHKPQVFKVAlAGAPVTVWM 

AYDTGrrERYMDVPENNQHGyEAGSVAI*HVEia.PNEPNRLLIXiH 

GFLDENVHFFHTNFLVSOLIRAGKPYQLQVAJLPPVSPQIYPNER 

HSIRCPEsGEHyEVTIiLKFLQEYL 


5442 


X 


3474


CGQRSRRRSPDMPEAKPAAKKAPKGKDAPKGAPKEAPPKEAPAE " 
APKEAPPEDQSPTAEEPTGVPr,KKPDSV3VBTGKDAVWAKVNG 
KELPDKPTIKWFKGKWliELGSKSGARFSFKESHKSASNVYTVEIi 
HIGKWLGDRGYYRIjEVKAKDTCDSCGFNIDVEAPRQDASGQSL 

esfkrtsbkksdtageldfsgllkkrevveeekkkkkkddddlg 
ippbiwbllkgakkseyekiafqygitdlrgmlkrlkktocvevk 
ksaapl'kkiidpayqvdrgnkiki/mvsisdpdltlkmpkngqeik 

PSSKYVPENVCKKRILTINKCTLADDAAyEVAVKDEKCFTELFV 

keppvlivtpledqqvfvgdrvemavevseegaqvmwmkdovbl 

TREDSFKARYRPKKDGKRHILIFSDWQEORGRyQVITWGGQCE 
AELrVEEKQIiEVLQDlADriTVKASEQAVFKCEVSDEKVTGKWyK 
NGVEVRPSKRITISHVGRFHKLVIDDVRPEDEGDYTFVPDGYAL 
GSLSAKI»NFLEIKVEYVPKQ\EPPKIPLGFASGGKTSENAD/IV 
WAGNKIiRLDV\SITGEAPSPFAT\WI,KG\DEVFTrrEGRTRIE 
KRVDCSSFVIESAQREDEGRYTIKVTMPIGEDVASIFLQWDVP 
DPPEAVRITSVGEDWAILVWEPPMYDGGKPVTGYIiVERKiCKGSQ 
RWMKLNFEVPTSTTYESTKMIEGILYEMRVFAVNAIGVSQPSMN 
TKPFMPIAPTSEPIiHLIVEDVTDTrrTLKWRpPNRIGAGGIPGy 
LVEYCLEGSEEtWANTEPVEROJFTVKNLPTGARlIiFRVVGVN 
lAGRSEPATLAQPVTlREIAEPPKIRbPRHLRQTyiRKVGEQLN 
LWPFQGKPR PQ WVrr KGGAPLDTSRVHVRTSDFDTVFFVRQAA 
RSDSGEYELSVQIENMKDTATIRIRWEKAGPPIWVMVKEVWGT 
KALVEWQAPKDDGNSE IMGYFVQKADKKTMBWFKVYERNRHTSC 
TVSDLXVGNEyypRVYTENICGLSDSPGVSKNTARILKTGlTFK 
PPEYKEHDFRMAPKFLTPI,XDRVWAGYSARLNCAVRGHPKPKV 
VWHKNKMEIREDPKFLI TmrQGVLTLNXRRPSPFDAGTYTCRAf 
NELGEAXjAECKLSVRVPQ 


5443


66 


1003 


SRGQIiDAGQSS EQHGGWRQPEQSRS RSSSSSSSPRRSRSAAEPA 
MALSMPLNGI>KEEDKEPLIEr,FVKAGSDGESIGNCPFSQRLFtlI 
LWLKGWFSVTTTVDLKRKPADLQNtiAPGTHPPFlTFNSEVKTDV 
NKIEEFLEEVLCPPKYLKLSPKHPESNTAGMDIFAKPSAYIKNS 
RPEANEALERGLLKTLQKLDEYLMS PI^PDEIDEMSMEDIKTSTR 
KFliDGNEMTLMGNUjPKLHIVKVVAKKyRNFDIPKEMTGIWRY 
L'rarAYSRDEFTNTCPSDKEVElNAYSDVAKRIiHQVKSRLIiKEVS 
FMSSP 


5444 


2 


344 


SGPIGVTGAQMAKWIiRDYLSFGGRRPPPQPPTPDYTESDlIoRAy 
RAQKNLDFEDPY*DSESRLEPDPAGPGDSKNPGDAKYGSPKHRL 
IKVEAADMARAKALLGGPGEELEADTEYLDPFDAQPHPAPPDDG 
yMEPYDAQWVMSELPGRGVQLYDTPYEEQDPETADGPPSGOKPR 
QSRMPQEDERPADEYDOPWEWKKDHISRAFAVQPDSPjEWERTPG 
SAKELRRPPPRSPQPAERVDPALPLEKQPWFHGPIiKRADAESi:.Li 
SLCKEGSYLVRLSETNPQDCSLSLRSSQGFLHLKFARTRENQW 
I^QHSGPFPSVPBfiVLKYSSRPLPVQGAEHLALLYPWTQTP*Q 
*PDWGDRRPNGQVATGLPELWGAEAPSAAAHPGIJIRERHPEX3r*P 
RAEKPGLRGPLLGLREPLGAGPRGPWGLQEPRRCQVMFSQAPAH 
QGGGCGYGQSQGPSGRPRGGAGSRH 


5445


2364 


486 


ILSRGFLGSVEICIQLPLPASEPVI.IitjTWARRRWRETRSRREPT 
TLRAQS VCPW WI * ETRMNRS I PVEVDESEPYPSQLLKP I PEYS P 
EEESEPPAPNIRNMAPNSLSAPTMIiHNSSGDFSQAHSTI.KUVNH 
QRPVSRQVTCI^RXQVLEDSEDSFCRRHPGLGKAFPSGCSAVSEP 
ASESWGALPAEHQFSFMEKRNQWLVSQLSAASPDTGHDSDKSD 
QSLPNASADSLGGSQEMVQRPQPHRNRAGI.DLPTICTGYDSQPQ 
DVLGXRQLERPl^PLTSVCYPQDLPRPLRSREFPQPEPQRYPACA 
QMtiPPNr>SPHAPWMyHYHCPGSPDHQVPyGHDYPRAAYQQVIQP 
ALPGQPLPGASVRGLHPVQKVILNYPSPWIX^EERPAQRDCSFPG 
IiPRHQDQPHHQPPNRAGAPGESLECPAElJlPQVPQPPSPAAVPR 
PPSNPPARGTLKrSNLPEEbRKVFITYSMDTAMEWKFVNFUjV 
NGFQTAIDIFfi^bRIRGIDriKWMBRYLRDKTVMirVAISPKYKQ 
DVEGAESQLDEDSHGLHTKYIHRMMQIEFIKQGSMKFRPIPVLF 
PNAKKEHVPTW1,QNTHVYSWPKNKKNILLRLLREEEYVAPPRGP 
LPTliQWPL 


5446 


972 


161 


SSWSWCTGRMRKTRItWGraiWMLFVSELRAATKLfEBKYELKEGQ " 

TLDVKCDYTLEKFASSQKAWQIIRDGEMPKTLACTfiRPSKNSHP 

VQVGRIIiBDYHDHGIiliRVRMVNLQVEDSGLYQCVIYQPPKEPH 

MLFDRIRLWTKGPSGTPGSNENSTQNVYKIPPTTTKALCPLYT 

TPRTVTQAPPKSTADVSTPDSElKI,TNVTDIIRVPVFNIVirtLA 

GGPIiSKSLVFSVLFAVTLRSFVP*AHEPTRMSSDFQPHPSGSCA 

KOGOBIR 


5447 


207 


617 


WXARTIiStiMASLVAYDDSDSEAETEHAGSFNATGQQKDTSGVAR 
PPGODFASGTI.DVPKAGAQPTKHGSCEDPGGYRI1PI1AQLGRSDR 
GSCPSQRliQWPGKEPQVTFPIKEPSCSSLMTSHVPASHMPlAAA 
RFKQVKLSRNPPKSSFHAQSESETVGKMGSSFQKKKCEDCWPy 
TPRRLRQRQALSTETGKGKDVEPQGPPAGRAPAPIiYVGPGVSEF 
IQPYLNSHYKETTVPRKVLPHLRGHRGPVNTIQWCPVIiSKSHML 

UOAOPJi^i^J-t ivv wiirtvuoufn^lJUJ. lOljtlllSftVKAAXW/iirlJviKjKlli 

SGGFDFALHLTDLETGTQLFSGRSDFRITTliKFHPKDHNIFLCG 
GFSSEMKAWDIRTGKVWRSYKATIQQTLDItFLREGSEFIiSSTD 
ASTRDSADRTl I AWDFRTSAKXSNQ I FHERFTCPSIALHPREPV 
FLAQTNGNYLALFSTVWPYRMSRRRRYEGHKVEGYSVGCECSPG 
GDLLVTGSADGRVLMYSPRTASRACTLQGHTQACVGTTYHPVtiP 
SVLATCSWGGDMia:WH*AFHWIiSIiGEAIGDI*APARGYSGPGRSIi 
KSPSPSKSLLVIiCGRAMFQPATCPWQIiPALSK 


5448 


194 


1833 


MASKOTDAIVWYQKKIGAYDQQIWEKSVEQREIKGI.RNKPKKTA 
HVKPDliXDVDLVRGSAFAKAKPESPWTSLTTKGIVRWFPPFFF 
RWWI^VTSKVIFFWIiLVLYLLQVAAlVLFCSTSSPHSIPLTEVI 
GPIWLMLLI/5TVHCQIVSTRTPKPPLSTGGKRRRKLRKAAHLEV 
HREGDGSSTTDNTQEGAVQNHGTSTSHSVGTVFRDLWHAAPFit 
GSKKAKWSIDKSTETDKGYVSLDGKKTVKSGEDGIQNHEPQCBT 








IRPEETAWNTGTLRNGPSKOTQRTITNVSPEVSSBE^PETGYSL 

rrhvdrtsegvlrnrkshhykkhypnedapksgtscssrcssSr 

QDSESARPESETEDVLWEDLLHCAECHSSCTSETDVENHOINPC 
VKKEyRDDPFHQSHIiPWLHSSHPGLEKXSAlWEGNDCKKADMS 
VLEISGMIMNHVNSHIPGlGYQIFGWAVSI/ir^LTPFVFRI^SQfl 
TDLEQLTAHSASELYVIAFGSNEDVXVLSMVI ISFVVRVSLVWI 
FFFX.I.CVAERTYKQVGIM*TSEGVLRNRKSHHYKKHYPNEDAPK 
SGTSCSSRCSSSRQDSESARPESETEDVLWEDLLHCAECHSSCT 
SETDVENHQINPCVKKEYRDDPFHQSHLPWLHSSHPGLEKISAI 
VWEGNDCKKADMSVLEISGMIMNRVNSHIPGIGYQIKSNAVSCI 
LGLTPFVFRLSQATDLEQLTAHSASELYVIAFGSNEDVIVI*SMV 
I IS FWRVSIiVWI FFFLLCVAERTYKQVGIM 


5449 


194 


1833 


MMKVTDAIVWYQKKIGAYDQQIWEKSVEQREIKGLRtJKPKKTA" 

HVKPDLIDVDLVRGSAFAKAKPESPWTSLTTKGIVRWPFPFFF 

RWWLQVTSKVIFFWLLVLyi,::4QVAAIVLFCSTSSPHSIPLTEVI 

HREGDGSSTTDNTOEGAVQNHGTSTSHSVGTVFRBLWHAAFFEiS 

GSKKAKNSIDKSTETDNGYVSLDGKKTVKSGEDG IQNHEPQCST 

IRPEETAWNTGTLRNGPSKDTQRTITETVSDEVSSEEGPETGYSr* 

RRHVDRTSEGVI>RNRKSHHYKKHYPNEDAPKSGTSCSSRCSSSR 

QDSKS ARPESETEDVLWEDLIjHCAECHS SeXSETDVENHOINPC ' 

VKKEYRDDPFHQSHLPWLHSSHPGLEKISAIVWEGMDCKKADMS 

VIiEISGMIMNRVWSHlPGIGYQlPGNAVSLILGLTPPVFRLSQA 

TDLEQLTAHSASEIiWI AFGSNEDVI VIiSMVI IS FWRVSLVtf I 

FPFLIiCVAERTYKQVGIM^TSEGVLRNRKSHHYKKHYPNEDA^K 

SGTSCSSRCSSSRQDSESARPESBTEDVIiWEDLIiHCAECHSSCr 

SETDVBNHQINPCVKKEYRDDPFHQSHLPWLHSSHPGLEKISAI 

VWEGNDCKKADMSVLEISGMIMNRVNSHIPGIGYQIfXSNAVSIiI 

U3LTPFVFRLSQATDliEQI,TAHSASELYVlAFGSNSDVIVI*SKV 

IISFWRVSLVWIFFFULiCVAERTYKQVGIM 


5450


8136 


1242 


GQQFASFFG*NHPEVTVAMALTDIDLQLQFSMSQPEALLliIAAG 
PADHLLIiQLYSGHLQVRLVX/QQEELRLQTPAETLLSDSIPHTW 
LTVVEGWATIiSVDGFLNASSAVPGAPLEVPYGLFVGGTGTIiGLP 
YLRGTSRPLRGCI>HAATLNGRSIii:jRPX»TPDVHEGCAEEFSASDD 
VALGFSGPHSLAAFPAWGTQDEGTLEFTluTTQSRQAPIiAFQAGG 
RRGDPlYVDIFEGHLRAWEKGQGTVLLHNSVPVAIKSaPHEVSV 
HINAHRLEISVDQYPTHTSNRGVLSYLEPRGSLLLGGLDAEA^R 
HIiQEHRIiGLTPEATNASLLGCMEDLSVlIGQRRGLREAIit.TRNMA 
AGCRLEEEEYEDDAYGHYEAFSTXAPBAWPAMEIiPEPCVPEPGL 
PPVFANFTQi:,LTISPLWABGGTAWIiEWRHVQPTIJOLMEAEIJi:< 
SQVIiPSVTRGAHYGEr.ELDILGAQARKMFTI,IiDVVHRKARFIH0 
GSEDTSDQLVLEVSVTARVPMPSCLRRGQTYIiLPIQVNPViroP? 
HIIFPHGSI^ILEHTQKPLGPEVFQAyDPDSACEGLTPQVLGT 
SSGI*PVBRRDQPGEPATEFSCRELEAGSLVYVHCGGPAQDLTFR 
VSDGLQASPPATLKWAIRPAIQlHRSTGLRLAQGSAMPIIiPAN 
LSVETNAVGQDVSVLPRVTGAliQFGELQKHSTGGVEGAEWMATQ 
AFHQRDVEQGRVRYLSTDPQHHAYDTVBNIALEVQVQQEILSNL 
S FP VTI QRATVWMlRI/EPLHTQNTQQBTLTTAHLEATLEEAGPS 
PPTFHYEWQAPRKGNLQIjQGTRLSDGQGPTQDDIQAGRVTYGA 
TARASEAVEDrFRFRVTAPPYFSPI^YTFPIHIGGDPDAPVLTKV 
LLWPEGGEGVLSADHLFVKSLNSASYI^YEVMERPRLGRXiAWRG 

tqdkttmvts ftnedllrgrlvyqhddset1*edd1ppvatrqge 
ssgdmaweevrgvprvaiqpvwdhapvottsrifhvarggrriil 
ttddvafsdadsgfadaqlvltrkdllfgsivavdeptrpiyrf 
ax?edtj?krrvi,fvhsgadrgwiqlqvsdgqhoatallevqasep 

YtiRVT^GSSIiWPQGGQGTIDTAVLHLDTHLDIRSGDEVHYHVT 

agprmgqlvragqpatafsqqdlldgavlyshngslspedtmaf 
sv£agpvhtoatlqvtialegpraplki.vrhkkiyvfqgeaaei 
rrdqleaaqeavppadivfsvksppsagylvmvsrgaladepps - 
r-dpvqsfsqeavdtgrvlylhsrpeamsdafstjdvasglgapll 

GVLVELEVLPAAlPIiEAQNFSVPEGGSI.TXAPPIiRVSGPYFPT 








LLGLSLQVLEPPQHGPLQKSDGPQARTLSAFSWRMVEEQLIRYV 
HDGSETLTDSFVIJ^ANASEMDRQSHPVAFTVTVIjPVNDOPPliiT 
TNTGLQMWEGATAP I PAEALRSTDGDSGSEDI*VYTIEQPSNGRV 
VLRGAPGTEVRSFTQAQLDGGLVIjFSHRGTLDGGFPFRLSDGEH 
TSPGHFFRVTAQKQVLLSLKGSQTLTVCPGSVQPLSSQTLRASS 
SAGTDPQLI.t»YRVVRGPQIiGRLPKAQQDSTGEA3aVNFTQAEVYA 
C5N I liYEHEMP PE PFWEAHDTLELQLSS PPARDVAATLAVAVSPE 
AACPQRPSHLWKNKGLWVPEGQRARIWAAI^ASNLLASVPSPQ 
RSEHDVLFQVTQFPSRGQbliVSEEPLHAGQPHFLQSQIAAGQLV 
YAHGGGGTQOtKSFHFRAHliQGPAGASVAGPOTSEAFAITVRDVN 
ERPPOPQASVPLRLTRGSRAPISRAOLSWDPOSAPGEIEYEVQ 
RAPHNGFliSLVGGGLGPVTRFTQADVDSGRLAFVANGSS VAGI P 
QLSMSDGASPPLPMSIiAVDILPSAlEVQLRAPLEVPQAIiGRSSL 
SQOObRWSDREEPEAAYRLIQGPOYGHIXVGGRPTSAPSQPQl 
DQGEWFAFTNFSSSHDHFRVLAIARGVNASAVVNVTVRAtL^ 
WAGGPWPQGATLRLDPTVLDAGELANRTGSVPRFRLLEGPRHGR 
WRVPRARTEPGGSQLVEQFTQQDI.EDGRIjGLEVGRPEGRAPGP 

agosiitiielwaqgvppavasldpatepynaarpysvai,i.svpea 
arteagkpesstptgbpgpmasspepavakggflsfleanmfsv 

IIPMCIiVIjLLtliALILPLLFYLRKRNKTGKHDVQVLTAKPRNGliA 

gdtetfrkvepgqaipltavpgqgpppggqpdpellqfcrtpmp 
alkngqywv 


5451


1 


2274 


rdsseqgrtgdtlgrpsacmdalkppclwrnhergkkdrdscgr " 
knsepgsphsiieaiirdaapsqglnplllptkmljfifnflpsplp 

TPALlCII.TFGAAIFIiWLITRPQPVLPIjXJ>IJJNQSVGlEGGAaK 
GVSQKNNDLTSCCFSDAKTMYEVFQRGtAVSDNGPCLGYRKPNQ 
PYRWLSYKQVSDRAEYLGSCLIiHKGYKSSPDQFVGIFAQNRPSW 
IISEIACYTYSHVAVPIiYDTLGPEAIVHXVNKADIAMVICDTPQ 
KALVLIGKVEKGPTPSLKVXILMDPPDDDIjKQRGEKSGIEILSL 
YDAENLGKEHFRKPVPPSPEDIUSVICPTSGrrGDPXaAMITHQW 
IVSNAAAPLKCVEHAYEPTPDDVAISYLPLAHMFERIVQAWYS 
CaARVGFPQGDIRLUUDEWKTIjKPTriFPAVPRIiLKRlYDKVQNE 
AKTPI/KKFLZjaAVSSKFKELQKGIlRHDSFWDKIilFAKIQDSI* 
GGRVRVIVTGAAPMSTSVMTPFRAAMGCQVYEAYGQTECTGGCT 
FTLPGDWTSGHVGVPriACNYVKLEDVADMNYFTVNNEGEVCIKG 
•i'NVFKGYLKDPEKTQEALDSDGWLHTGDIGRMIiPWGTUaiDRK 
KNIFKX^AQGEYIAPEKIENIYNRSQPVLQIFVHGESLRSSLVGV 
VVPDTITVLPSFJUVKLGVKGSFEEt*CQNQVVREAIIiEDLQKIGKE 
SGLKTFEQVKAIFLHPEPFSIENGLLTPTLKAKRGELSKYFRTQ 
IDSIiYEHIQD 


5452


1833 


1138


SRVPSLCiiSLSIiSIiSPSREPVAGAPGCGTAGPPAMArLWGGIiIiR 
LGSLLSIiSCIjAIiSVLLLAQLSDAAKNFEDVRCKCICPPYKENSG 
HXYNKNISQKDCDCLHWEPMPVRGPDVEAYdJRCECKYEERSS 
VTIKVTI 1 1 YLSIIXSLLLLYIWYLTLVEPILKRRI/FGHAQLIOS 
DDDIGDHQPFTVNAHDVLARSRSRANVLNKVEYAQQRWKIjQVQEQ 
RKSVFDRHWLS 


5453


111 


1520


psipaavpqsappephreetvtatatsqvaqqppaaaapgbqav ■ 

PQEERSQQQDDIEELETKAVGMSNDGRFLKFDIE IGRGSPKTVY 

kgldtettvevawceiiqdrkltkserqrpkeeaemlkglqhpni 
vrfydswestvkgkkcivlvtelmtsgtlktylkrfkvmkikvl 
rswcrqir.k:gi:*qfi.htrtppiihrdc»kcdnifrtgptosvkigd 
lgiatucrasfaksvigtpefmapet^eekydesvdvyapgmcm 

LEMATSEYPYSECQNAAQIYRRVTSGVKPASFDKVAIPEVKEII 
EGCIRQNKDERYS IKDLtiNHAFFQEBTGVRVEJaAEEDDGEKIAI 
KliWIiRIEDIlCKLKGKYKDNEAIEPSFDLERNVPEDVAQEMVKSG 
YVCEGDHKTMAKAIKDRVSLI KRKREQRQL* 


5454


111 


1520


PSIPAAVP0S7VPPEPHRBETVTATATSQVAQQPPAAAAPGEQAV 
AGPAPSTVPSSTSKDRPVSQPSLVGSKEEPPPARSGSGGGSAKE 
PQEERSQQQDDIEEIiETKAVGMSKDGRPLKFDIEIGRGSFKTVf 
KGLI7rETTVB\WWCEI»QDEUaLTKSERQRFKEEAEMIiKGLQHPNI 








VRFYDSWESTVKGKKCIVLVTELMTSGTLKTYLKRFKVMklKVIi 
RSWCRQILKGLQFLHTRTPPIlHRDLKCDNiPITGPTGSVKlfeD 
LGIiATLKRASFAKSVIGTPEFMAPEMYEEKYDESVDVYAFGMCM 
LEMATSEYPYSECQNAAQIYERVTSGVKPASFDKVAlPEVKEll 
EGCIRQNKDERYSIKDLLNHAFFQEETGVRVELAEBDDGEKIAI 
KLWLRIEDIKKLKGKYKDNEAIEPSFDLERNVPEDVAQEMVESG 
yVCEGDKKTMAKAXKDR VSLIKRKREQRQL * 


5455


1359


377 


LTMVSPATRKSLPKVKAMDFITSTAIIiPLLFGCLQVFGL'FRIJ^ 
WVRQKAYLRNAVWITGATSGUSKECAKVFYAAGAKLVLCGRKG 
GALEELIRELTASHATKVOTHKPYLVTFDLTDSGAIVAAAAEII, 
.QCFGYVDI LVNNAGISYRGTIMDTTVDVDKRVMETNYFGPVALT 
KALhPSMIKRRQGHIVAISSIQGKMSIPF'RSAYAASKHATQAFF 
DCLRAEMEQY E 1 EVTVI S PGYIHTNLS VMAITADGSRYGVMDTT 
TAQGRS PVEVAQDVIJWIVGKKKKD Vr IJUDLLPSrAVyr^RTIAPG 
LFFSLMASRARKERKSKWS 


5456 


2 


2332 


CGAGLVAAGAVLVLYPASRAGERTRVP3SPAPSSLPLHSPGACG 
TEVDMDPQRSPLLEVKGNIELKRPLIXAJPSQI/PI^GSRXiKRRPD 
QMEDGLEPEKKRTRGIX5ATTKITTSHPRVPSLTTVPQTQGQTXA 

KKPSKRPAWDLKGQLCDLNAELKRCRERTQTLDQEWQQLQDQIiR 

DACKJQVKALGTERTTLEGttlAKVQAQAEQGQQELKNLUACVLBIi 

EERLSTQEGLVQELQKKQVELQEERRGLMSQLEEKERRLQTSEA 

AI^SSQAEVASLRQETVAQAAUCTEREERLHGLEMERRRliHNQlj 

QEI»KGNIRVFCRVRPVI,pQEPTPPPGIjr*LFPSGPGGPSDPPTRIi 

SLSRSDERRGTLSGAPAPPTRHDFSFDRVFPPGSGODEVFESIA 

MIiVQSALDGYPVCIFAYGQTGSGKTFTMEGGPGGDPQLEGLIPR 

ALRHLFSVAQELSGQGWTYSFVASYVEIVNBTVRDLLATGTRKG 

QGGECElRRAGPGSEELTVTNARYVPVSCKKEVDAlJJHLARQiJR 

AVARTAQNERSSRSHSVFQLQISGEHSSRGLQCX3APLSLVDIAG ' 

SERliDPGLALGPGERERLRETQAINSSLSTLGLVIHALSNKESH 

VPYHMSICLTYLLONSLGGSAKMLMFVNISPIiEEWVSBSLNSIiRP 

ASKVEPSVLFGTAQSNRKfrJKTDPDLCVCVCVCVCVCVCVCVCVP 

MSMYKVRGGR VAGGCFIG WRAPCPRAI K 


5457 


2 


1540 


DDFVERRRWTRTTCLVRS PPHVPVCGHACSWNGGSLDPLKGTPA" 

LLRSAERLMRKVKKI1RLDKENTGSWRSFSI4NSEGAERMATTGTP 

TADRGDAAATDDPAARFQVQKHSWDGLRSIIHGSRKYSGLIVMK 

APHDFQFVQKTDESGPKSHRX.YYLGMPYGSRENSLLYSEIPKKV 

RKEAIXIJ[.SWK0MLDHFQATPHHGVYSRBEB2J[/RBRKRI,GVFGr 

TSYDFHSESGLFLFQASNSLFHCRDQGKNGFMVSPGPGCVSPMK 

PXjEZKTQCSGPRMDPKXCPADPAFFSrimSDLWVmXErGSBR 

RLTFCHQGI*SNVI)DDPKSAGVATFVIQEEFDRFTGYWWCPTASW 

EGSEGLKTLRI LYEEYDESE VE VIHVPS PALEERKTDSy RYPRT 

GSlQ^PKIAIiKlAEFQTDSQGKIVSTQEKEtiVQPFSSLPPKVEYI 

ARAGWTRDGKYAWAMFUDRPQQWLQLVLLPPALFIPSTENEEQA 

ASLC0SCP0ECPAVCX5VRGGHCRLD0CS 


5458 


6642 


4022 


FVPGLREPQWEPAQPSATMSAPSEEEEYARtVMEAQPEWtiRAEV 
KRLSHELAETTREKIQAAEYGIiAVLEEKHQCjKIiQFEKLBVDYEA 
IRSEMEQLKEAFGQAHTNHKKVAADGESREESLIQESASKEQYY 
VRKVIiEI^TELKQLRNVLTNTQSENERI»ASVAQEIJCEINQWVEI 
QRGRLRDDIKEYKFRSARLLQDYSELEEENISLQKQVSVI^QNQ 
VEFEGLKHEIKRLEEETEYLNSQLEDAIRLKEISBRQLEEALET 
LKTEREQKNSIiRKEI/SHYMSINDSFYTSHLHVSLDGIiKFSDDAA 
EPNWDAEAIjVNGPEHGGLAia,PLDNKTSTPKKEGI*APPSPSX.VS 
DLLSELNXSEIQKt^KQQt^lQMEREBCAGLLATriQDTQKQriEHTRG 
SLSEQQEKVTRLTEWbSALRRLQASKERQTALDNEKDRDSHEDG 
DYYEVDINGPEILACKYHVAVAEAGELREQLKALRSTHEAREAQ 
HAEEKGRYEAEGQALTEJCV^LLEKASRQDRBDIARLEKSLKKVS 
DVAGETQGSLSVAQDELVTFSEELANLYHHVCMCNNETPNRVMI* 
JOYYREGQGGAGRTSPGGRTSPEARGRRSpriitiPKGLLAPEAGRA 
IK3G1X3DSS PS PGSSLPS PLSDPRREPKlrl Ym*I Al IRDQIKHdb 
AAVDRTTELSRQRlASQErjGPAVDKDKEALMEEIIiKLKSliiSTK 



5459






3XG 






1262 



REQITTLRTVoKANKQTAEV AlJajbKSKYENhlKAMVrrETMMiq. -^ 
NELKALKEDAATPSSLRAMFATRCDEYITQLDEMQROLAAAaDE 
KKTLNSLLRMAIQQKIALTQRLELLELDHEQTRRGRAKAAPKTIC 
^ATPSVSmCACASORABGTGLANQV FCSBIQiSIYCD 
RGGHRLSGMASNFNDIVKQG Y VRIRSRRLGI ifUKCWLVFKKASS 
KGPKRLEKFSDERAAYFRCYHKVTEIiNNVKNVARLPKSTKKHAI 

GIYFNDDTSKTFACESDLEADEWCKVLQMECVGTRIlsmiSIiGEP 
DLriATGVEREOSERPWWrA'0<;T>MTy:ir'VMrTi?/^»T ^_ 



- - *^'^-^'**'^-=»'^«Kzi^KUTTWi'TFKAGRMCE!PGBGIiFIF 

QTRDGEAiyQKVHSAALAIAEOHERLL<JSVICHSMLQMKMSERAA 

f^o^^^'^^^^^"^^^^"^'^°^^^^°^5SPLKLHRTETF 
PAYRSEH 



5461



663 



RPGCRAGBLST6SRARERVRNRVSAPCGUUi>KKCUPEVJL.RGRSP 
GLGIAEMPSCGACTCGAAAVRLITSSLASAQRGISGGRIHMSVL 
GRLGTFETQ1LQRAPLRSFTETPAYPASKDGISKDGSGD6NKKS 
ASEGSSKKSGSGNSGKGGNQLRCPKCGDI^CTHVETPVSSTRPVK 
CEKCHHFFWLSEADSKKSIIKEPESAAEAVKLATOJKPPPPPK 
KIYNYU5KYWGOSPAKXVLSVAVYNHYW2IYNNIPANLRQOAE 
VEKOTSLTPREI^EIRRREDEYRFTKLIiQIAGISPHGNALGASMQ 
QQVNQQIPQEKRGGEVLDSSHDDIKLBKSKIIiLLGPtrcSGKTLL 
AQTIAKCLDVPFAICDCTTLTQAGYVGEDIESVIAKI.LQDANYN 
VEKAQQGIVFUDEVDKIGSVPGIHQLRDVGGEGVQQGLLKLLEG 
TIVNVPEKNSRKUiGETVQVDTTNILPVASGAFNGLDRI r SRRK 
NEKYLGFGTPSNLGKGRRAAAAADLANRSGESNTHQDIEEKDRL 
LRHVEARDLlEPGMIPEPVGRLPVWPLHSLDEKTLVQIIiTEPR 
NAVIPQYQALFSMDKCElWrEDAIiKAXARLALERKTGARGLRS 
IMBIO^LLBPMFEVPNSDJVCVEVDKirVVEGKKEPGYIRAPTJC'F*? 
SEEEYDSGVEEEGWPRQADAANS 



rNPPPPPKSFC;GRARKWRR RKKPGAPEAAV Mt ii.PSGPGPERLFD " 
SHRLPGDCFLLLVIJLLYAPVGPCLLVLRLFLGIHVFLVSCALPD 
SVLRRFWRTMCAVLGI^VARQEDSGLRDHSVRVLISNHVTPFDH 
NIVNLLTTCSTPLLNSPPSFVCWSRGPMEMNGRQEDVESLKRFC 
ASTRLPPTPLLLFPEEEATHGREGLLRFSSWPFSIQDWQPLrL 
QVQRPLVSVTVSDASWVSELLWSLFVPFTVYQVRWLRPVHROLG 
EANBEFAliRVQQLVAKELGQTGTRLTPADKAEHMKRQRHPRLRP 
QSAQSSFPPSPqPSPDVQLATUVQRVKEVLPHVPLGVIQRDLAK 
TGCVDLTITNLLEGAVAFMPEDITKGTQSLPTASASKFPSSGPV 
TPQPTALTFAK3SWARQESLQERKQALYEYARRRFTERRAOEAD 



237 



KIKERQMSANKSPPSAQKSVLPTAlP AVI.PAASPCSSPKTGIiSA " 
RLSNGSFSAPSLTNSRGSVH1?VSPLLQIGI,TRESVTIEAQELSL 
SAVKDLVCSIVYQKFPECX^FFG^m)KII*LFRHDMNSE^^lLOLIT 
SADEIHEGDLVEWLSALATVEDFQIRPHTLYVHSYKAPTFCDY 
CGEtiLWGhVROGLKCEGCGLNYHmChFKXPmCSGVRKRRLSU 
VSLPGPGLSVPRPLQPEYVALPSEESHVHQEPSKRIPSWSGRPI 
WMEK^WICRVICVPHTFAVHSYTRPTXCQYCKRLLKGLPRQGMQC 
KDCKPNCHKRCASKVPRDCU5EVTFNGEPSSLGTDTDIPMDIDN 
NDINSDSSRGLDDTBEPSPPEDKMFFLDPSDLDVERDEEAVKTI 
SPSTSIWIPlWRWQSIKHTKRKSSTMVKEGWmjYTSRDNLRK 
RHYWRLDSKCbTI.FQNESGSKYYKEIPLSEIIJ?ISSPRDPTNIS 
QGSNPHCFEIITDTMVYFVGENNGDSSHNPVXAA'TCVGLDVAOS 
WBKAIRQALMPVTPQASVCTSPGQGKDHKDLSTSISVSNCOIOE 
^^rSTVYQXFADEVX^SGQFGIVYGGKmKTGnDVATKVIDm 
RFPTKQESQLRNEVAILQNLHHPGIVm.ECMFETPERVFVVMEK 
LHGDMLEMILSSEKSRLPERITKFMVTQILVALRNLHPKNIVHC 
DI.KPENVLI1ASAEPFPOVKLCDFGFARI 1GEKSFRRSWX5TPAY 
LAPEVLRSKGYNRSLDMWSVGVrtYVSLSGTFPFNEDEDINDQI 
OKTAAFMYPPNPMREISGEAIDI^INNLLQVKMRKRYSVDKSLSHP 
WLQDYQTWLDUIEPETRIGERYITHESDOARWEIHAYTHNLVVP 
KHPIMAPWP0DMEEDP 



LLSVTMTTSRCSHLPBVLPDCTSSAAPW Kl-VKDCGSLVNGQPt 
YVMQVSAKDGQtJ^STVVRTLATQSPFNDRPMCRICHEGSSQEDL 






5465 



5466 



5467 









19S 



€77 



5278 



3348 



352 



2103 



5468 



225 



2976 



lspcectgtlgtihrsclehwlsssntsycku:hfrkaverk4»r' 

PLVEWLRNPGPQHEKRTLFGDMVCFLPITPIiATISGWLCU^GAV 
DHLHFSSRLEAVGLIALTVALFTIYLPWTIiVSFRyHCRLYNEVm 
RTOQRVXLLIPKSVWVPSNQPSL LGmsVKRNSKETW 



SPSMWPRKKVQLKliIIVG AiOVGKTSLLHO^VHKTFYEEYQTTIi 
GAS ILSKI I ILGDTTLKU3I WDTGGQERVRSMVSTFYKGSDGCI 
IAFDVTDLESPEAI,DIWRGDVLAKIVPMEQSYPMVLLGNKIDLA 
DRKYQSILE^fHtlTEsrKLSPDQSRSRC C 

KUDPREFIRVHREALECDYVSAHLHEKIDLIPGYlCQQGPA^^ 
VNrVFHHLPYEGQVDIYNINDPLKETATIGFIMNEX3QIPKQLFKK 
PHPPKRVRSRLNGDNAGISVLPGSTSDKIFFHHLDNLRPSLTPV 
KELKEPVGQIVCTDKGILAVEQWKVlilpPTWNKTPAWGYADLSC 
RLGTYESDKAMTVYEC3jSEWGQILCA1CPNPKLVITGGTSTWC 
VWEMGTSKEKAKTVTIJCQALIiGHTDTVTCATASLAYHIIVSGSli 
DRTCIIWDLWKLSFLTQLRGHRAPVSALCIKEL1X3DIVSCAGTY 
IHVWSINGNPIV/SVNTFTGRSQQIICXICMSEMNEMDTQMVIVTG 
HSDGWRPWRMEFLQVPETPAPEPAEVLEMQEDCPEAQIGQEAQ 
DBDSSDSEADEQSISQDPKDTPSOPSSTSHRPPAASnttA'raaw/- 



A uii^ttUUHRKWSDQIiSLDEKDGFI FVNYSEGQTRAHIKJGPLSHP 
HPNPlEVRNYSRLKPGYRWERQLVFRSKIiTMHTAFXtRKDNAHPA 
BVTALGISKDHSRILVGDSRGRVFSWSVSDQPGRSAADHWVKDE 
GGDSCSGCS VRFSLTERRHHCRNCGQLFCQKCSRPQSEI KRIOCI 
SSPVRVCQNCYYNLQHERGSEDGPRHC 

iiACAHASAHASGRLVRWWIUaeRSVMG iqTSPVLIASJX^GLVTi; ' 
LGLAVGSYLVRRSRRPQVTliLDPNEKYLLRLLDKTTVSHMTKRP 
RFALPTAHHTLGIiPVGKHlYI^STRIDGSLVXRPYTPVTSDEDQG 
YVDLVIKVyr.KGVHPKFPEaGKMSQYIiDSI»KVGDWEPRGPSGL 
LTYTGKGHFNIQPNKKSPPEPRVAKKLGMIAGGTGITPMU3LIR 
AILKVPEDPTQCFUCiFANQTEKDIILREDLEEU3ARYPNRFKLW 
FTI,DHPPKDWAYSKGFVTAOMIREKL?APGDDVLVU^CGPPPMV 
QLACHPNLDKLGYSQKMRFrr 

GEAI.RVGTRGCRRCLPDPQARIFIQKkDLEBDESVTAAHLKSRQ" 



z xujvtvijan lortjji i^ijAUSS XSKPDVI TLLEQEKBPWMVVRKETS 
RRYPDLBLKYGPEKVSPENDTSEVNIiPKQVIKOlSTTIiGIEAFy 

FRNDSEYRQFEGLQGYQEGNINQKMISYEKliPl*HTPHASLICKT 
HKPYECKECGKYF.qrnflMT.TnHnc Ttrprrwow/'trT^/^tir* tt 



A wx* 1 HJTw^r « A ijisit ru ja cjufiijUKAFKI^PTQtNRHKNIHTVKKLP 

ECKECGKSFNRSSNLTQHQSIHAGVKPYQCKECGKAFNRGSNLI 

QHQKIHSNEKPPVCfCBCGMAFRYHYQL lEHCOIHTGBKPPECKB 

CGKAPTLLTKLVRHQKIHTGEKPFECREOSKAFSI^LNQLNRHKN 

IRTGEKPPECKECGKSFNRSSNLVQHQSIHAGIKPYECKECGKG 

FNRGAHLIQHQKIHSNEKPFVCRECEMAFRYHCQLXEHSRIHTO 

DKPFECQDCXSKAFNRGSSLVQHQSIHTGEKPYECKECGKAFRLY 

LQLSQHQKTHTGEKPFECKECGKFFRRGSNIiNQKRSIHTCKKPF 

ECKECGKAFRLHMHLXRHQKLHTGEKPPECKEOGKAFRr«MQLl 

RHQKLHTGEKPFECKECGKVFSLPTQLNRHKN IHTGEKAS 

SFljTCDLFQSIiAQLENIiCKQLYBTTDTT TRLQAEKAI.VEFTNSPD 

CLSKCQLLLERGSSSYSQLLAATCLTKIiVSRTNNPLPLEQRIDI 

RNyvmyixATRPKIiATFVTQALIQLYARITKLGWFDCQKDDYVP 

RNA.TTDVTRFLQDSVEYCIIGVTILSQLTNEINQVSATAPLIEA 

DTTHPLTKHRKIAS S FRDS SLFDIFTIiSCNLLKQASGKNLNU(D 

ESQHGLLMQLLKLTHNCLNFDPIGTSTDESSDDIiCTVQIPTSV/R 

SAFLDSSTLQXjSTIGRCEYEKTCALIiVQIjFDOSAQSYQELLQSA 

SASPMDIAVQEGRIiTWIiVYI IGAVIGGRVSFASTDEQDAMDGEI, 

VCRVLQLMNLTDSRLAQAGNEKLELAMLSFFEQFRKIYIGDQVQ
KSSKLYRRLSEVRXSLNDETMVLSVFIGKI ITWLKYWGRCBPITS
KTLQLLNDLS IGYSS VRKLVKLSAVQFMLNMHTSEHPS FUSINN

QSWLTDMRCRTTFYTALGRrJJIVDLGEDEDQYEQFMLPLTAAFE , 
AVAQMFSTNS FNEQEAKRTLVGLVRDI.RGI APAFNAKTS FMMLlf 
EWI YPS YMP II^RAIELWYHDPACTTPVLKLWAELVHNRfiQRLQ 






j FDVSSPNGILX.FT^ETSKMITMYGNRllTLGEVPKIX5VYiUM;KG^ 
IS ICFSMLICAALSGSYVNFGVFRLYGDDALDNALQTFI KljLlJsi 
PHSDLLDVPKLSQSYYSLUEVLTQDHMNFIASLEPHVIMYIEjSS 
ISEGtiTALDTMVCTGCCSCLDHIVTYLFKQI,$RSTKKRTTPDNQ 
ESDRFLHIMQQHPEMIQQMLSTVIJ^IIXFEDCRMQWSMSRPLLG 
LiLLNEKYFSDIiRNSIVNSQPPEKQQAMHLCFENLMEGIERNLL 
TKWRDRFTQNLSAFRREVNDSMKNSTYGVNSNDMMS 


5469 


134 


2653 DQEFETSLVPWHLPMGWLCSGLLFPVSCLVLLQVASSGNI^KVL^ 
EPTCVSDYMS ISTCEWKMNGPTNCSTELRLLYQt.VFLr,SEAHTC 
VPENNGGAGCVCHta^DVVSADNYTLDLWAGQQrjIiWKGSPKPS 
EHVKPRAPGNLTVHTNVSDrLLLTWSNPYPPDNYLYKHLTYAVW 
IWSENDPADFRIYNVTYLEPSUlXAASTLKSGISyRARVRAWAQ 
CYir?TWSEWSPSTKWHNSYREPFEQHEj:JXJVSVSCIVIIAVCLL 
CYVS r TKI KKEWWDQI PNPARSRLVAII lODAOGSOWPK-D conn 

EPAKCPHWKNCLTKLLPCFI^EHNMKRDEDPHICAAKEMPFQGSGK 
1 SAWCPVEISKTVLWPES1SVVRCVEI,?EAPVECE2EEEVEEEKG 
SPCASPESSRDDFQEQREGIVARLTESIjFrjDLLGEENGGFCQQD 
MGESCLLPPSGSTSAHMPWDEFPSAGPKEAPPWGKBQPLHLEPS 
PPASPTQSPDNLTCTETPLVIAGNPAYRSPSNSLSQSPCPREliG 
PDPLIiARHLEEVEPKMPCVPQLSEPTTVPQPEPETWEQILRRNV 
LQHGAAAAPVSAPTSGYQBPVHAVEQGGTQASAWGLGPPGEAG 
J VKAPSSLIASSAVSPEKCGFGASSGEEGYKPFQDLIPGCPGDPA 
PVPVPLFTFGLDREPPRSPQSSHLPSSSPEHLGLEPGEKVEDMP 
KPPI,PQEQATDPI.VDSLGSGXVYSALTCEir,OGI{LKQCHGQEDGa 
QTPVMASPCCGCCCGDRASPPTTPLRAPDPSPGGVPLEASLCPA 
SUVPSGISEKSKSSSSPHPAPGNAOSSSQTPKIVNFVSVGPTYM 


5470 


17 


1413 TACRIRTSLWRGIAAVKK0AVfc;MCASYGtAySLMKFFT*SpMSDF 

KNVGLVFVNSKRDRTKAVLCMWAGAIAAVFHTLIAYSDLGYYI ' 
INKLttHVDESVGSKTRRAFLyLAAFPPMDAMAMTHAGILLKHKY 
SFLVGCASlSDVIAQWFVAri:,3:,HSHLBCREPLLrPILSr,YMGi^ 
LVRCTTLCLGYYKNIHDIIPDRSGPELGGDATIRKMLSFWWPLA 
LILATQRISRPIVNLFVSRD1X3GSSAATEAVAILTATYPVGHMP 
YGWLTEIRAVYPAFDKNNPSNKLVSTSNTVTAAHIKKPTFVa-lA 
LSLXLCFVMFWTPNVSEKILIDIIGVDPAFAELCWPI^RIFSFF 
PVPVTVRAHLTGWLMTLKKTFVLAPSSVXiRIIVLIASLVVLPYL 
GVHGATLGVGSLLAGPVGESTMDAIAACYVYRKQKKKMENESAT 
1 EGEDSAMTDMPPTBEVTDIVEMREENE 




5471 


18^8 


658 RSSAPPGPQRAAAATAAAAAAGVEMAAAAAQGGGGGEPRRTEGV ' 
GPGVPGEVEMVKGQPFDVGPRYTQU3YIGEGAYGMVSSAYDHVR 
KTRVAIKKISPFBHQTYCORTLREIQILIJIFRHENVXGIRDILR 
ASTLEAMRDVyiVQDLMETDLYKLLKSQQLSNDHICYFLYQILR 
GLKYIHSANVLHRDLKPSNLLINTTCDLKICDFGIARIADPEHD 
1 HTGPLTEYVATRW YRAPEIMLNSKGVTKS IDIMS VGCILAEMLS 
NRPIPPGKHYLDQIJraiI/?TI^SPS0KDlJ^CIINMKARimK3SL 
PSKTKVAWAKLFPKSDSKALDIiLDRMLTFNPNKRITVEEAIiAHP 
YLEQYyDPTDEPVAEEPFTFAMELDDLPKERLKELIFQETARFO 
PGVLEAP 




5472 
5473 " 


1465 


753 I'YVWARYLSDEkVAVSIDRLCKANGRSPSIPFGTVRIPGRARVR 
DPQALWIPGYGSLVWRPDFAYSDSRVGFVRGYSRRFWQGDTFHR 
GSDKMPGRWTLLEDHEGCrWGVAYQVQGEQVSKALKYLNVREA 
VLGGYDTKEVTFYPQDAPDQPLKAXAYVATPQNPGYLGPAPEEA 
lATQIIACRGFSGHNLEYLIiRVRDVMQLCXSPQAQDEHIAAIVEA 
VGTMLPCPCPTEQALALV 






3 


2113 FMm^LIQDt£DIEQRVPVMDAQYKIITKTAHLITKE^PQfeEG 
KEMFATMSKLKEQLTKVKECYSPLLYESQQLLIPLBEIiEKQMTfl 
FYDSLGKtNElITVLEREAQSSALFKQKHQELLACQENCKKTLT 
LIEKGSQSVQKWrLSNVLKHFDQTRLORQIADIHVAFQSMVKK 
TGDWKKHVETWSRLMKKFEESRAELBKVIiRIAQEGXiEEKGDPBE 
LLRRHTEFFSQIJDQRVZ.NAFLKAa)ELTDILPEQEQQGLQEAv{l 
1 KLHKQWKDLQGEAPYHLLHLKIDVEKNRFLASAEECRTEIiURET ( 


5474 






RDPVRDTPGTCHVTLKBLRAAIDSTYRKLMEDPDKWKDYTS^^ 
EFSSWISTNBTQUJCGIKGEAIDTANHGBVKRAVEEIRNGVTtOiG 
in-LSWLKSRLKVLTEVSSENEAQKQGDELAKLSSSFKALVT^S 
EVBKMLSNFGDCVQYKEIVKNSLEELISGSKEVQEQAEKILDTE 
NLFEAQQLLLHKQQKTKRISAKKRDVQQQIAQAQQGEGGtPDRG 
HEELRKLESTLDGLERSRERQERRIQVTLRKWERPETNKETWR 
YLPQTGSSHERPLSPSSLBSLSSELEQTKEFSKRTBSIAVQAEN 
r.VK£:ASEIPLGPQNKQLLQQQAKSIKBQVKKLEDTLEEEYVIDK 




2 


780 


TPDVRui,uASKRGlAVASWCSPKWFAGEEMA]^kSljWLUlQSTl 
rJOiWKKtWFDLWSDGHLlYYDIXJTRQNIEDKV^ 

QECmtQPPl^KSKDCMLQXVCRJXSKTlSLCASSTDXiCLAnKPr 
I^DSRTNTAYVGSAVMTDETSWSSPPPyTAYAAPAPEVGRTLS 

PAKOVXIRERYRDNDSDlJVUSMIAGAATra^aTrcT iTunm^^^^ 1 


S4/b 192 


506 


VSQKNMEDYLQAI^lSlAVRKIAI,r.LKPDKE?EHS^^S?J^S 
STFR»nrrVQFDVGVEFEEDI.RSVDGRKCQTIVTMEEEHI,VCVQK 
GEVPNRGWRHWLEGEMLYLELTARDAVCEQVFRICVR 




1457 


no™!^^^^"^'^^'^'^^^^^^^^^QS^TSIHQYLVDEPT^^ 
PSTRASEVLCStm^SHYELQVElGRGPDNLTSVHIJtfUiTPTGTL 
VTIKITNLENCNEERLKALQKAVILSHFPRHPNITTYWTVFTVG 
SWI,WVISPF^^AyQSASQLLRTYFPEG^lSElT.TRWT-»W'nvt9AT^ 

AVYDFPQFSTSVQPWlJSPELLRQDUtGYNVKSDIYSVGITACEL 
ASGQVPFQDMKRTQMLLQKIiKGPPYSPLDISIFPQSBSRMKKSO 
SGVUSGIOESVX^VSSGTHTVNSDRLHTPSSKTFSPAFFSLVQLC 
LQQDPEKRPSASSLI.SHVFFKQMKEESQDSILSIAPPAYNKPSI \ 
SLPPVLPWTEPBCDt^TOiiKDSyWEF ^^^NK^SZ f 


5477 j 3 
5478 2 


1044 


KUNaHLRYSHEbKLQLPRLPELPETGRQLbUH VE VATEPAGSR r 
VQBKVFKGLDLLEKAAEMLSOLDLFSRWEDLEEIASTDLKYLLV 
PAFQGALTMKQVNPSKRLDHLQRAREHFINYLTQCHCYHVAEF^ 
LPKTMWKSAENHTAlTSSMAYPSLVAMASgRQAKlQRYKQKKELE 
HRLSAMKSAVESGQADDERVREYYIJCaHLQRWIDISLEEIESIDQ 
EIKILRERDSSREASTSNSSRQERPPVKPPILTRNMAQAKVFGA 
GYPSLPTMrVSDWYEQHRKYGALPDQGIAKAAPEEFRKAAQQQE 
EQEEKEEEDDEQTLHRAREWDDWKDTHPRGYGNROMM« 


5479 2


835 


KavKXWPm/KGb'^rVFRAHTATVRSVHFCSDGOSFVTASPDKT 
VXVWATHRQKFLFSLSQHINMVRCAKFSPDGRLIVSASDDKTVIC 
liWDKSSRECnraSYCEHGGFVTYVDFHPSGTCIAAAGMDNTVKVW 
DVRTHRLLQHYQLHSAAVNGLSFHPSGNYLITASSDSTLKILDL 
ME6RLLYTLHGHQGPATTVAPSRTGEYFASGGSDBQVMVWKSNP 

LENQQLTO^TP^^^"^^^"^"^^^ 


^4An ( 5rrr — 


83 5 

J 
I 
1 
] 


KTVRXWVPNVKGKSTVFRAHTATVRSVKFCSDGQSFVTASDDKT 1 
VKVWATHRQKFLFSX>SQHINVTOT?fWv KPS PnaPT.TUQx onni^ 

bWDKSSRECVHSYCEHGGFVTYVDFHPSGTCIAAAGMDNTVKW; 

OVRTHRIXOHYOLHSAAVNGLSPHPSGNYLITASSDSTUCIU)!, 

-lEGRl^YTLHGHQGPATTVAFSRTGEYFASGGSDEQVMWKSNF 

31GDHGEVTKVPRPPATIASSMGNLTVSILEQRLTLEEDKLKQC 
jENQQLIMQRATP ^'i^i^x^^wv. 






iys2 ■■ I 
c 
z 

£ 
F 
K 
E 
H 


.SLTSRMEEAELVKGRI^AITDKRKIQEHtSQKRLKIEEDICLKH 1 

JHLKKKALREKWLLDGISSGKEQEEMKKQNQQDQHQIQVLEQSI 

JU^KEIQOI^fCAELQrSTKEEArLKKLKSlERTTEDIXRSVKV 

niEERAEESIEDlYANIPDLPKSYIPSRLRKEINEEKBDDEQNR 

MYAMEIKVEKDLKTGESTVLSSIPLPSDDFKGTGIKVYDDGQ 

:SVYAVSSNHSAAYWGTDGrAPVEVEEI^QASERNSKSPTEYH 

PVYANPPYRPrrPQRETVTPGPKFQERIKIKTTfGLGIGVNBSi 

:rfMGNGLSEERGWNFKHISPlPPVPHPRSVIQQAEEKI»HTPOKl | 


5481 






bMTi^WKhiSNVMgoKDAPSPKPRLSPRBTlKUK^EHQNSSPTCQE" 
DEEDVR YN I VHS L P P D INDTEP VTMT PMG vnnapn q p t? vt.^ m 

GinxSIIHAELWlDDEEEEDEGEAEKPSYHPlAPHSOVyoPAKP 
TPLPRKRSEASPHEKHKS 


5482 


3 


1422 


NS PGSVCliCQCVGPSLLHCIiPPLLLtiLIiLPLLIiHESPQPPALRV 
VATSSDRNFMNKHQKPVLTGQRFKTRKRDEKEKPEPTVPRDTLV 
QGLMEAGDDLEAVAKFLDSTGSRLDYRRYADTLFDILVAGSMLA 
PGGrRIDDGDKTKMTKHCVFSANEDHETIRNVAQVFNKLIRRYK 
YLEKAFBDEMKKLbLPLKAFSETEQTKLAMLSGIIiLGNGTIiPAT 
iiJoiJvi^iLiji>u\sf7\vKltFKAWMAEKIDANSVTSSIjRKAW 
LDKRLLELFPVNRQSVDHFAKYFTDAGLKEIiSDFLRVQQSLGXR 
KEIiQKEKyERLSQECPXKBWLYVICEEMKRNDLPETAVIGbLWT 
CIMNAVEWNKKEELVAEQALKHLKQYAPLLAVPSSQGQSELJI^ 
QKVQEYCYDWIHFMKAFQKIWLFYKADVLSEEAII^KWYKEAHV 
AKGKSVFLDQMKKFVEWLQNAEEESESEGEEN 




14^2 


528 


EGLQEKDSGPYSCSVHVQDKQGKSRGHSIKTLELNVLVPPAPPS 

CRLQGVPHVGANVTLSCQSPRSKPAVQYQWDRQIiPSFQTFFAPA 

LDVXRGSLSLT^^:,SSS^IAGVYVCKAHNEVGTAQCNVTLEVSTGP 

GAAWAGAVVGTIiVGIiGLlAGLVl,UYHRRGKALBEPANDIKEDA 

lAPRTLPWPKSSDriSKWG'IXSSVTSARALRPPHGPPRPGALTP 

TPSLSSQALPSPRLPTTDGAHPQPISPIPGGVSSSGLSRMGAVP 
VMVPAQSQAGSLV 


5483
5484 


1 


788 


FFFFKGCRAGRQNESDYRia*EEMHQRFt.VSERSKDDLQLRLTRA 
ENRIKQLETDSSEEISRYOEMIQKLQNVLBSERENCGliVSEQRL 
KLQQENKQLRK^TESLRKIALEAQKKAKVKISTMEHEFSIKERG 
FEVQLREMEDSNRNS I VELRHLLATQQKAANRWKEETKK1.TESA 
FlRTNNLKSELSRQKIJrrQEIJ^SQI»EMANEKVAENEKLILEHQE 
KANRl^QRRLSQAEERAASASQQLSVITVQRHKTUVSLMNLBNI 




3 


1^97 


IMADMBDLFGSDADSEAERiCDSDSGSDSDSDQENAASGSNASGS ' 
ESDQDERGDSGQPSNKELFGODSEDEGASHHSGSDNHSERSDNR 

w**»xiivo L/no V ijyn:>ijieAPNDDEDEGHRSIXK5 
AEGSEKAHSDDEKWGRBDKSDQSDDEKrQNSDDEErRAQGSDEDK 
LQNSDDDEKMQNTDDEERPQI^DDERQQLSEBEKANSDDERPVA 
SDNDDEKQNSDDEEQPQIiSDeEKMQWSDDERPQASDEEHRHSDD 
EBSQDHKSE3ARGSDSEDKVhmKRmAIA3X>SEM)SDTEVPKD 
NSGTMDLFGGADDISSGSDGEDKPPTPGQPVDENGDPQDQQEEE 

PIPETRIEVEIPKVNTDLGNDLYFVKLPNFLSVEPRPFDPQYYE 
DEFEDEE^^JDEEGRTRLKLKVEl^^IRWRIRRDEE^3^JETlr)^cKrn« 

IVKWSDGSMSLHLGKEVFDVYKAPLQGDHNHLFIRQGTGLQGQA 
VFKTKLTFRPHSTDSATHRKMri.SXADRCSKTQKIRa:r.PMAGRD 
PECQRTEMIKKEEERLRAS IRRESQQRRMREKQHQRGLSASYLE 
PDRYDEBEEGEESrSLAAIKKRYKGGIREERARIYSSDSDEGSE 

EDKAQRLLKAKKLTSDEVRPIJLFMSRGLSCTQEPTALNEELTDQ 
AGTN 


5485
5486


161 


1074 


KRKILSSMMDSEAHEKRPPILTSSKQDiSPHITNVGEMKHYUIG 
CCflAFNMVAITFPIQKVLFRQQLYGllCTRDAlLQLRRDGFRWI,y 
RGILPFLMQKTTTLALMFGLYEDLSCLLHKHVSAPE FATSGVAA 
VIAGTTEAIFTPLERVQTLLQDHKEIHDKPTNTYQAFKALKCHGI 
GEYYRGLVPII>FRNGLSNVLFFGtJ?GPIKEHLPTATTHSAHLVN 
DFICGGLLGAMLGFLPFPINWKTRIQSQIGGEPQSFPKVFQKI 
f^LERDRKLXNLFRGAHIiNYHRSLlSWGI INATYEFLtiKVI 




1404 


142 

< 

3 
3 
I 
I 


X PGSTISWSPAAARGi.SVCHCCRLHPASAMDI,FGDI.PEPERSPR 
PAAGKEAQKGPLI^FDDbPPASSTDSGSGGPLLFDDLPPASSGDS 
SSIATSXSQMVKTEGKGAKRKTSEEEKNGSEELVEKKVCKASSV 
CPGLKGYVAERKGERBEMQDAHVru^DITEECRPPSSLITRVSY 
^'AVFDGHGGIRASKFAAQNLHQNLIRKFPKGDVlSVEKTVKRCIi 
^DTFKHTDEBFIiKQASSQKPAWKDGSTATCVUWDWrtiYIANIiG 
5SRAILCRYNEESQKHAALSLSKEHNPTQYEERMRIQKAGGNVR 
XSRVLGVLEVSRSIGDGQYKRCGVTSVPDIRRCQLTPNDRFljfej 








ACDGLFKVFTPBEAVWFILSCIiEDBKIQTREGKSAADARYESg:^ 
NRIiANKAVQRGSTVDNVTVMWRIGH 


5487 


535 


182 


AVSLEQIRGLQTPAPVPJUPLQPCPSNCDMERVTLALIXIAGLTA " 
LEANDPFANKDDPFYYI>WKNLQLSQLICGGIiLAlAGXAAVLS<3K 
CKCKSSQKQHS P VPEKAI PLITPGSATTC 


5488


1072 


2S9 


AMAASGEPQRQWQEEVAAWWGSCMTDLVSLTSRLPKTGETIH 
GHKFFIGFGGKGANQCVQiAARLGAMTSMVCKVGKDSFGNDYIBN 
LKQNDISTE PTYQTKDAATGTAS 1 1 VNNEGQNI I VIVAGANXiLI. 
NTEDLRAAANVI SRAKVMVCQLE ITPATSLEALTMARRSGVKXIi 
FNPAPAXADLDPQFYTLSDVFCCNESE/VE ILTGliTVGSAADAGE 
AALVI»LKRGCQVyiITL.GASGCVVLSQTEPEPKHIPTEKVKAVD 
TTVSFKI 


5489


81 


893 


GKGPVAAFIDQSNIFLTPP:<IFLGQWREEPKMPU,LLCJ^TEPtJC " 

LERDCRSPVEPWAAASPDLALACLCHCQDLSSGAFPNRGVLGGV 

LFPTVEMVIKVFVATSSGSIAIRKKQQEWGFLEA>JKIDFKEIiD 

IAGDEDNRRWMREWVPGEKKP0NG1PLPPQIFNEEQYCX5DFDSF 

FSAKEENIIYSFLGLAPPPDSKGSEKASEGGETEAQKEGSEDVG 

NI.PEA0EKNEEEGETATEET3E1AMEGABGEAEEEEETAEGESP 

GEDEDS 


5490 


81 


893 


GKGPVAAFIDQSNIFLTDPKIFLGQWREEPKMPLIiIJjG^^EPLK' 

JuERDCRS PVEPWAAASPDIiAlACLCHCQDtiSSGAFPNR(5VLGGV 

LFPTVEMVIKVFVATSSGSIAIRKKQQEWGFLEANKIDFKEIiD 

lAGDEDNRRWMRENVPGBKKPQNGlPUPPQIFNEEQYCGDFDSP 

FSAJCEENIIYSFLGIAPPPDSKGSEKAEEGGETEAQKBGSEDVG 

NLPEAQEKNEEEGETATEETEEIAMEGAEGEAEEEEETAEGEEP 

GEDEDS 


5491


204 


1194 


GSAPRJUSLGPTGAQARDPDWWARPPSRP YTQSKEDRPDTEGRSE 
QGDMASSFLPAGAIl'GDSGGELSSGDDSGEVEFPHSPEIEETSC 
LAELPEKAAAHLQGIiIQVASREQIJ:.YLVARYKQVKVGWCNTPKP 
SFFDFEGKQKWEAWKAT^DSSPSQAMQEYrAVVKKLDPGWNPQI 
PEKKGKEANTGFGGPVISSLYHEETIREEDKNIFDYCRENNIDH 
ITKAlKSKNVDVNVKDEEGRALLHVgACDRGHKEIiVTVLUJHRAD 
INCQDNEGQTALHyASACEFLDIVErjLLQSGADPTJaRDQDGCIiP 
EEVTGCKTVSLVLQRHTTGKA 


5492 


3 


1896 


ASKWPt^AVCTTGIMSSLAVRDPAMDRSliRSVFVGiJlPYEAt^fi 
QLKDIPSEVGSWSFRLVYDRETOKPKGYGFCEYQDQETALSAM 
RNLNGREFSGRALRVDNAASEKNKEELKSLGPAAP I IDSPYGDP 
IDPEDAPES I TRAVASLPPEQMFELMKQMKIiCVQNSHQBARNMli 
IXJNPQrAYALC^SAQVVMRIMDPErALKILHRKIHVTPLrPGKSQ 
SVSVSGPGPGPGPGLCPGPNVLIiNQQNPPAPQPQEIARRPVKDI 
PPtiMQTP lOGGI PAPGP I PAAVPGAGPGSLTPGGAMQPQLGMPG 
VGPVPX.ERGQVQMSDPRAP I PRGPVTPGGLPPRGIit^DAPNDPR 
GGTrlLSVTGEVEPRGYLGPPHQGPPMHHASGHDTRGPSSHE^^RG 
GPLGDPRLLIGEPRGPMIDQRGLPMDGRGGRDSRAMETRAMETE 
VLEraVMERRGMETCAMETRGMEARGMDARGLEMRGPVPSSRGP 
MTGGIQGPGP INIGAGGPPQGPRQVPGISGVGNPGAGMQGTGIQ 
GT6MQQAQ IQGGGMQGAG IQGVS IQGGG lOGGGI QGAS KQGGSQ 
PSSFSPGQSQVTPQDQEKAAtilMQVLQIiTADQlAMLPPEQRQSI 
IiILKEQIQKSTGAS 


5493 


1 


1876 


RAPMMTKAVPESPRKPGRLTQALNSPLTWEHVWICVPGGTPDCI*" 

TDTFRVKRPHIiRRSASNGHVPGTPVYREKBDMYDEIIELKKSLH 

VQKSDVDIMRTXLRRLEEENSRKDRQIEQIiLDPSRGTDFVRTLA 

EKRPDASWINGt^KQRILKLEQQCKEKDGTISKLQTDMKTTNLK 

EMRIAMETYYEEVHRLQTLIASSETTGKKPiiGEKlCTGAKRQKK^ 

GSALLSLSRSVQELTEENQSLKEDLDRVLSTSPTISKTQGYVEW 

SKPRLLRRIVELEKKLSVMESSKSHAAEPVRSHPPACIASSSAL 

HRQPRGDRNKDHERLRGAVRDLKEERrALQE<2LI*QRDl»BVKQLL 

QAKADLEKELECAREGEEERREREEVL.REEIQTt>TSKlKJELQBM 

KKEEKEDCPEVPHKAQELPAPTPSSRHCEQDWPPDSSEEGLPRP 

RSPCSDGRRDAAARVLQAQWKVYKHKKKKAVLDEAiiVVLQAAF'fe 








GHLTRTKLLAS KAHGS EPPS VPQIjPDQSSP VPRVPS P I AQATGS 
PVQEEAI VI IQSALRAHLARAEHSATGKRTTTAASTRRRSASAT 
HGDASSP PFIAALPDPS PSGPQAVAPLPGDDVNSDDSDD I VIAP 
SLPTKNPPV 


5494


71 


536 


RSKAKIGTPTRBVPSTDMKVKRESSSStiTHRPAPSPATPRLLGT ' 
RRVI>LGVSEGTGC7U:AMELVLVFLCSLLAPMVI*ASAAEKEKEKD 

PFHYDYQTLRXGGLVFAWLFSVGIIILILSRRCKCSFNQKPRAP
GDEEAQVENLITANATEPQKAEN


5495 


273 


2168 


DSLLLIQVDTMPFTLHLRSRLPSAIRSLILQKKPNIRNTSSMAG ■ 

ELRPASLWLPRSLAPAPERPCQVNTGPLPIiLGQSEPEKWMLPP 

OGAlSETRMGHPOFWKYEFGACTOSIiASLEQySBQLKDWVAFFL 

GCS FS LBEALEKAGLPRRDPAGHSQAGAYKTTVPCVTHAGFCXrP 

liWTMRP IPKDKLEGLVRACCSLGGEQGQPVHMGDPELLGI KEL 

SKPAYGDAMVCPPGEVP VFVJPSPIiTSLGAVSSCETPIAFAS I PG 

CTVMTDLKDAKAPPGCLTPERIPEVHHISQDPLHYSIASVSASQ 

KIRELESMIGIDPGNRG IGHIXCKDELLKASLSLSHARS VLI TT 

GFPTHFNHEPPEETDGPPGAVALVAFLQAI»EKEVAIIVDQRAWK 

LHQKIVEDAVEQGVLKTQIPILTYQGGSVEAAQAPLCKNGDPQT 

PRFDHLVAIERAGRAADGNyyNARKMNlKHLVDP IDDLFIAAKK 

IPGISSTGVGDGGNELGMGKVKEAVRRHIRHGDVIACDVEADFA 

VIAGVSNWGGYAIiACALYlLYSCAVHSQYLRKAVGPSRAPGDQA 

HTQALPSVIKEBKMLGILVQKKVRSGVSGIVGMEVDGLPFHNTH 

AEMIQKIiVDVTTAQV 


5496


3 


2408 


QDTKMHEXYKGNITPQLNKWTLKTSAATDVWAVYFSQFWIDY3G 
MKSGRGRPISFVDSFPXjS IWICQPTRYAESQKEPQTCNQVSLNT 
SQSJSSSJDLAGRIiKRKKLLKEYYSTESEPLTNGGQKPSSSDTFFR 
PSPSSSEADIHLLVHVHKHVSMQINHYQYLLLIiFIC'IESIilLI.SE 
NURKDVEAVTGS PASQTS ICIGILt,RSABl4ALIiI.HPVDQANTL.K 
SPVSESVSPWPDYIiPTENGDFIiSSKRKQISRBINRIRSVTVNH 
MSD^-RSMSVDI^H1PLKDPLLFKSASDTNLQKGISPMDYLSDKH 
LGKISEDESSGI.VYKSGSGEIGSETSDKKDSFYTDSSSVLNyR3 
DSNILSFDSDGNQNILSSTLTSKGNETlESIFKABDLIiPEAASL 
SENLD 1 S KEBTPPVRTLKSQSSLSGKPKERCPPNXtAPLiCVSYKN 
MKRSSSQMSXiDTISLDSMlLEEQLLESDGSDSHMFLEKGNKKNS 
TWJYRGTAESVNAGANr/QNYGETSPnAI STNSEGAQENHDDLMS 
VWFKITGVNGEID IRGEDTE ICLQVNQVl'PDQLGN I SLRHYLC 
NRPVGSDQKAVIHSKSSPEISIiRFESGPGAVIHSLXAEKNGPLQ 
CHIKNFSTEFLTSSIJ*3NIQHFI,EDETVATVMPMKIQVSNTKIKL 
KDDSPRSSTVSLEPAPVTVHIDHI>VVERSDDGSFHIRDSHMLNT 
GNDLKENVKSDSVLLTSGKYDliKKQRSVTQATQTSPGVPWPSQS 
ANFPEFSFDPTREQLMB HKESIiKQELAKAKMAttAEAHIiBKDALZj 
HHIKKMTVE 


5497 


1821 


3308 


SISKLLKRRSJTIDAYIiSNSCAFFAPRIiFSl*ASQIIREQQSPNV 
CPIifKYSGFPSriECQCHFVSPHSSCYIWFFSFPPPFFVCFQLSN 
GFSHYSLSSESHVGPTGAGLFPHCLPASRLLPRVTSVHLPDYAH 
YYTIGPGMFPSSQ I PSWKDKAKPGPYDQPLVNTLQRRKEKREPD 
PWGGGPTTASGPPAAAEEAQRPRSMTVSAATRPGEEMEACEELA 
XiAliSRGLOIjDT'ORS SRDSLOCS S GYSTOTTTPPr'*^ PriT T PQnVQ 

DYDYFSVSGDQEADQQEFDKSSTIPRNSDISQSYRRMFQAKRPA 
STAGLPTTLGPAMVTPGVATIRRTPSTKPSVRRGTIGAGPIPIK 
TPVIPVKTPTVpDIiPGVLPAPPDGPEERGEHSPESPSVGEGPQG 
VTSMPSSMWSGQASVNPPLPGPKPSIPEEHRQAIPBSEAEDQER 
EPPSATVSPGOIPESDPADLSPRDTPQGEDMLMAIRRGVKLKKT 
TTMDRSAPRFS 


5498 


2434 


1492 


ILTHQEI FTGEKPCECGKASIQMSHLSQQKI YSGENPFACKVCG
KVFSHKSNLTEHEHFHTRE KPFECMEOGKAFSQKQYVI KHQNTH
TGEKTIFBCASTECGKSFSQKENLLTHQKIHTGEKPFECKDCGKAFI
QKSNLIRHQRTHTGEKPFVCKECGKTPSGKSKLTEHEKIHIGEK
PFKCSECGTAFGQKKYLIKHQNIHTGEKPYECWEOGKAFSQRTS
LIVHVRIHSGDKPYECNVCGKAFSQSSSI.TVHVRSHTGEKPYG3

NECGKRFSQPSTLALHIjRIHTGKKPYQCSECGKAFSQKSHHIRH 
QKIHTS 


5499 


324 


526 


GFGQXGRGHKi:?TyPFSPRKSGRKGMAQSgCiWVKRyiKAFCK6F 
FVAVT>VAVTFLORVACVARVEGASMQ?SLNPGGSQSSDVVL]:iNH 
WKVRNPEVHRGDI VSLVS PKNPEOKI IKRVIALEGDI V1?TTRHV 

NRYVKVPRGHIWVEGDHHGHSFDSNSFGPVSLGLLHAHATHILW 
PPERWQKLESVLPPERItPVQREEE 


5500


1978 


1286 


KPDWRLQNLPPRLyLWRSSRFGFGHLKKRI^MDPKIEHTWDGFP 
VKHBPVFIRLKPGDRGVMMDrSTVPPPRDPPAPLGEPGKPFNELW 
DYEWEAFFLKDITEQYLEVELCPHGQHLVLI,I,SORRNVWKQEL 
PLSFRVSRGETKWEGKAYLPWSYFPPNVTKFNSFAIHGSKDKRS 

DIiWLIEKCDI 


5501 


2927 


2226 


CRPPVSARVAPGHQGAVGGSGRRPARVEWDAAARPSSRPFStiP 
AAIMXJU^ISRLLDWFRSLFWKEEMEIiTLVGLQYSGKTTFVNVtA 

SGOPSEDMI PTVGPMMPTCVT V"r2M\/T»T VTiitnTrsri/^ririTm**!*! 
•-"-•X* «j.irx vv?r v*i u\r^\ i A.umv 1 XJvJLWiJXldOyPRFRSMWERY 

CRGVNAIVYMlDAADREKlEASRNEIiHmiIiDKPQLQGIPVI,VLG 

NKRDIiPNAIiDEKOLIElCMNLSAIQDREICCYSISCKEKDNIDIT 
LQWLIQHSKSRRS 


5502 


3 


824 


NSAFPVWVPERTALLTCPLGAAPGSSREAPGIAGPPNSTAFf^lor" 

GKFFKGGGSSKSRAAPSPQEALVRLRETEEMLGKKQEYLENRIQ 

REIALAKKHGTQNKRAALQALKRKKRFEKQIiTQIDOTliSTIEPQ 

REALEKSHTNTEVLRNMGFAAKAMKSVHENMDIJJKIDDLMQEIT 

EQQDIAQE I SEAFSQRVG PGDDFDBDEI*MAELEEIiEOEELNKKM 

TNIRLPNVPSSSLPAQPNRKPGMSSTARRSRAASSQRAEEEDDD 

IKQIAAMAT 


5503 


216. 


654 


KGVRRRGRVRSDSEDSHl^YFKMSFLLfe'Ki.TSKKEVDQAIKSTA 
EKVLVLRPGRDEDPVCLQLDDILSKTSSDLSKMAAIYLVDVr>QT 
AVYTQYFDISYIPSTVFFFNGQHMKVDYGGEDPALRSIKAVRRT 
SPAGTIiGEKPVKS ^ 


5504


56 


3563 


QLSFSFQAPVTFDDITVYLlOEEWVLI^SQQQKELCGSiiKLW 

GPTVTVNPELFRkFGRGPEPWLGSVOGQRSLIiEHHPGKKOMGYMG 

EMEVQGPTRESGQSLPPQKKAYIiSHIiSTGSGHIEGDWAGRNRKL 

LKPRSIQKSWFVQFPWLIMNEEQTALFCSACREyPSrRDKRSRL 

lEGYTGPFKVETLKYHAKSKAHMPCnTNAlJy^PIWAARE^ 

DPPGDVIAS PEPLFTADCP I FYPPGPI.GGFDSMAELI,PSSRAEI» 

EVPGGDGAlPmYLDCISDLRQKEITDGXHSSSDimL'mmVE 

SCIQDPSAEGLSEEVPWFEELPWFEDVAVYFTREEWGMriDKR 

QKELYRDVMRMNYELliAStjGPAAAKPDLISKLERRAAPWIKDPII 

GPKWGKGRPPGNKKMVAVREADTQASAAtJSAIiM'GSPVEARASC 

CSSSICEEGDGPRRIKRTYRPRSIQRSV^PGQFPWI.VIDPKETKI* 

FCSACXERPNLHDKSSRLVRGyTGPPKVETi:iKYHEVSKAHRLCV 

NTVEIKEDTPHTALVPEISSDLMANMEHFFNAAySIAYHSRPI*H 

DFEKILQLLQSTCTVILGKYRNRTACTQPIKYISETLKREILBD 

VRNSPCVSVLr^DSSTDASEQACVGryiRYFKQMEVKESYITLAP 

LYSElTiDGYFETIVSAliDELDIPFRKPGWWGLGTDGSAMLSCR 

GGIiVEKFQEVIPQLLPVHCVAHRLfOJ^WDACGSlDLVKKCDRH 

IRTVFKFyOSSNTKRLWELQEGAAPLSQBlIRLKDLNAVRWVASR 

RRTiaALLVSWPAIARHLQRVAEAGGCJIGHRAKGKLKIiMRGFHF 

VKPCHFLUDFLSIYRPLSEVCQKEIVLlTEVNATLGaAyVALES 

LRHQAGPKEEEFKASPKDGRX*HGICI*DKLEVAEQRFQAr>RERTV 

LTG lEYLQQRFDADRP PQLKWME VFDTMAWPSGlEIiAS FGNDDI 

LNIiARYFECSLPlOTSEEALLEEWLGUCTIAQHLPFSMLCKNAI. 

AQHCRFPLLSKl^VWCVPISTSCCERGFKAMNRlRiaJERTKL 

SNEVLNMLMMTAVNGVAVTEyDPQPAIQHWYLTSSGRRPSHVYT 

CAQVPARSPASARLRKEEMGALyVEEPRTQKPPtLPSREAAEVI. 

KDCIMEPPERLLYPHTSQEAPGMS 


5505 


3312 


1219 


NCSPRSLSAAKI4SNRNNNKLPSWLP0liQNLIKRDPPAYIEEFLQ ' 
QyNHYKSNVElFKLQPNKPSKELAELVWFMAQrSHCyPEYIiSNF 
PQEVKDLLSCNHTVLDPDLRMTFCKALILbRNKNLINPSSLI^EK 
FPEIiFRCm>KIaLRKTLYTHIVTDIKNINAKHKlWKVNVVlX2NF^ ( 






YTMLRDSNATAAKMSIiDVMIELYRRNIWNDAKlVNVITTACFSK 
VTKrLVAALTPFU?ICDE02KQDSDSESEDDGPrARDriLVQyA^G 
KKSS KNKKKIiEKAMKVLKKHRKKKKPEVFNFSAIKIiIHDPQDFA 
EKLLKQLECCKERFEVKMMLMNLtSRLVGIHELFLFNFYPFLQR 
FLQPHQREVTKIU/FAAQASHHLVPPEIIQSIjLMTVANMFVTDK 
NSGEVMTVGXNAIKEITARCPIAMTEEXLQDIAQYKTHKDKNVM 
MSARTLrHLFRTLMPQMLQKKFRGKPTEASIBARVQEYGELDAK 
DYIPGAEVLEVSKEENAENDEDGWESrSLSEEEDADGEWIDVQH 
SSDEEQQEISKKLNSMPMEERKAKAAAISTSRVLTQEDFQKIRM 
AQMRKEIinAAPGKSQKRKYrEIDSDEBPRGELLSLRDIBRLHKK 
Ftt.bUJtb-XTUjATAMAGKTDREOEFVRKKrKTNPFSSSTNKEKKKQK 
NFMHMRYSQNVRSKNKRSFREKQLALRDALLKKKKRMK 


5506 JL 

1 


1531 


BRGDIiCGUKUGSAPGEGGSSAWPAPAHPliPEREREREALCPGRS 
CSGGGGEETPGTTPVWSPXiEGGGDEELRPNPYVRFPYRWWAWV 
IJUVFPSLGAGGETPEAPPESWTQLWFFRPVVNAAGYASFrWPGY 
LLVQYFRRKNYLBTGRGLCFPLVKACVFGNEPKASDBVPIiAPRT 
EAAETTPMWQAIiKLLPCATGLQVSYLTWGVLQERVMTRSYGATA 
TSPGERFTDSQFLVLMNRVLAi:iIVAGl^CVI.CKQPRHGAPMYRY 
SFASLSNVLSSWCQYEALKFVSFPTQVLAKASKVIPVMLMGKLV 
SRRSYEHWEYliTATLISIGVSMFI*LSSGPEPRSSPATTI.SGt,lL 
IAGyiAFDSFTSNWQDALFAyKMSSVOMMFGVNKFSCI.FrVGSI, 
LEQGALLBGTRFMGRHSEFAAHALIiLS ICSACX3QLFIFYTIGQF 
GAAVFTIlMTLRQAFAILLSCIiLYGHTVTWGGLGVAWPAALL 
LRVYARGRIiKQRGKKAVPVESPVQKV 


5507 3704


1271 


PRGTRRCRPAGRASRRARRRPPCPGPAAPGSLEIGGFGTAAGKK^ 
VAVADVQPGPMRFHQDQLQi/UCVFTKEDNQCNGFCRACEKAGFK 
CTVTKEAQAVIiACFLDKHHDl 1 1 IDHRNPRQLDASTVIiCRS IRSS 
KJ.SENTVIVGWRRVDREEX.SVMPFISAGFTRRYVENPNIMACY 
NEJULQJOEFXSEVRSQLKLRACiVSVFTALENSEOAIHITSEDRPlQ 
i>UNir'Aj?i:.XXTiGyQSGEIjIGKELGBVPINEiaCADLLDTINSClRI 
GKEWQGI YYAKKKKr^ONIQQMVKI I PVIGQGGKIRHYVSl IRVC 
NGNNKAEKISBCVQSDTHTDNQTGKHKDRRKGSLDVKAVASRAT 
EVSSQRRHSSMARXHSMTI EAPITKVINl IKAAQESSPMPVTEA 

NEYVIjSTKNTQMVSSNI ITP I SLDDVPPRIARAMENEEYWDPDI 
FELEAATHNRPLIYLGLKMFARPGICEFmCSESTLRSKLQtlE 
ANYHSflNPYHK'STHSTUDVLHATAYFLSKERIKETbDPIDEVAAL 
lAATIHDVDHPGRTNSFLCNAGSEIAlIiYNDTAVLESHHAALAP 
QIiTTGDDKCNIFKWMBRWDYRTLRQGIXDMVriATEM'r'KHPFTTVNr 
KPVNSINKPIiATLEENGETDKNQEVINTMXJlTPENRXIilKRMIiI 
KCADVSNPCRPLQYClEWAARISEEYFSQTDEEKQQGIiPWMPV 
FDRNTCSIPKSQISFIDYFITDMFDAWDAFVDLPDLMQHLDJJNF 
KYWKGLDEMKLRWIiRPPPB 


5508 1151 




LSSVFSRRSASMPAVGCSMGPFLHYWYLSLDRLFPASGIiRGPPN 
VL!CICVI.VDQLVASPXJ.GVWyFlX3LGCX.BGQT7GESC<?EIJlEKF^ 
EFYKADWCVWPAAQFVNFLFVPPQFRVTYIMGIiTLGWDTYLSYL 
KYRSPVPLTPPGCVALDTRAD 


5509 1238 


619 


RKSRGCQNALSASGPAAAAAAIMVRKLKFHEQKtLKQVD^LJlWE 
VTDHNLHELRVLRRYRLQRREDyTRYNQl.SRAVREIJVRRLRDLP 
ERDQFRVRASAALLDKLYALGLVPTRGSLELCDPVTASSFCRRR 
LPTVLLKLRMAQHLQAAVAFVEQGHVRVGPDWmJPAFIiVTRSM 
EDFVTWVDSSKIKRHVLEYNEERDDFDLEA 


5510 96 " 


1155 

1 
< 

J 


PAGAHZiSSGSSEPLVEPGRGRVGARVKGERGLQASGSAPGRSKM 
AEGERQPPPDSSEEAPPATQNFIIPKKEIHTVPDMGKWF^SQAY 
ADYIGFXLTLNEGVKGKKLTFEYRVSEAIEKLVALLNTLDRWID 
ETPPVDQPSRFGNKAYRTWYAKIJ^EEAENLVATVVPTHLAAAVP 
EVAVYLKESVGNSTRIDYGTGHEAAFAAFLCCLCKIGVLRVDDQ 
lAI VFKVFNRyi,BVMRKI,QKTYRMEPAGSQGVWGi:iDDFQFr.P Fl 
WGSSaLlDHPYLEPRHFVDEKAVNENHKDYMFIiEClLFlTEMKT 
SPFAEHSNQLWNISAVPSWSKVNQGIilRMYKABCIiEKFPVXQHif 
KFGSLLPIHPVTSG 




5513 



5514



12SS 



637 



Ut'^bI.LTJTV7T:;VTVI.VLV LKjjMWSRRREPlTl.gU^>KAKyPLPI. 
lEKBKISHKTRRFRFGLPSPDHVLGLPVGNYVQUAKIDNELW 
RAYTPVSSDDDRGPVDLIIKIYPiQJVHPQYPEGaKMTQyi»ENMK 
IGETIFFRGPRGRLFyirGPGNLGIRPDQTSSPKKTrADHIiGMIA 
GGTGlTPMLQLIRKITKDPSDRTRMSLIFANQTEEDILVRKEr.E 
BIARTHPDOFDLWYTUDRPPIGWKySSGFVTADMlKEHLPPPAK 
STLILVCGPPPLIQTAAHPNIiEKLGYTQDMlFTy 



449 



5515



1572 



5516



246 



735 



499 
13 7S 



AKWRI.PSDSPKiPPAGA£T PGRGSCRNYX>PiiSSPPPPEPSSFP£r 

PPTSRGGPGSRDIWSDSEEESQDRQLKIVVLGDGASGKTSLTTC 

FAQETFGKQYKQTIGLDFFLRRITLPGNIJIVTLQIWDIGGQTIG 

GKMLDKyiYGAQGVLLVYDITKYQSFENLEDWrrWKKVSEESE 

TQPLVALVGNKIDLEHMRTIKPEKHIiRFOQENGPSSHFVSAKTG 

DSVPLCFQKVAAEILGIKLNKAEIEOSQRWKADIVNYWQEPMS 
RTVNPPRSSMCAVQ w«.«ri«, 



VKRPSWIMGNKRGHAI.PGTFb'J.Xl<^l,WWCi'KSILKYICKK(5l^ 
Cn^LGSKTLFYR:,EILEGITIVG^4ALTGMAGEOFIPGCPHLMr,3fr> 
YKOGHWNQLLGl^HHPTMYFFFGLLGVADILCFTISSLPVSLTKL 
MLSNALFVBAPIPYNHTHGI^MLDIFVHQrJiVLVVFLTGLVAFE, 
EPLVRN^LELLRSSLILriQGSWFFQXGFVLYPPSGGPAWDt*! 

DHEKILFLTICFCWHYAVTIVXVGMHrYAFXTWLVKSRLKRLCSS 
EVGLLIQTAEREQESEEEM 

WKLVGRGDCDPl,LSVCIin'MPLYEGLGS QGEKTAWIDX.GEAP " 
TKCGPAGETGPRCriPSVIKRAGMPKPVRVVQYNINTEELYSYL 
KEFIHILYFRHLLVNPRDRRWIXBSVLCPSHFRETLTRVLFKY 
PEVPSVIJ^SmI^^ALI»TLGINSA^miDCaYRBSLVI,PlYEGIP 

vx^cwgalplggkalhkei^qij:,eqctvdtsvakeqslpsvmg 

5VPEGVLEDIKARrCFVSDLKRGLKI0AAKFWXDGNNERPSPPP 
KVDYPUJGEKIIiHILGSIRDSWEILFEQDWEBQSVATLILDSI. 
IQCPIDTRKQLAENtiWiGGTSMLPGFLHRUAEIRYLVEKPKY 
WCAIX5TKTFRIHTPPAKANCVAWLGGAIFGALQDI3LGSRSVSKE 
YYNQTQRIPDWCSUINPPLEMMFDVGKTQPPLMKRAPgTElf 



WSKiiPPPAGPGi^aPRKSPTA SijFLFPWRPIASSFWWGAQGAQE r 
IKAMWRVPGTTRRP^^•GESPGMHRPEAMI,LU:,TLALLGGPTHAG 
KMYGPGGGKYFSTTEDYDHEXlX3LRVSVGU:,r,VKSVQV!CLGDSW 
DVICLGALGGimiEVTLQPGEYITKVFVAFQAFLRGMVIOTSKDR 
YFYFGKbDGQISSAYPSQEGQVLVGlYGQYQLLGIKSXGFEWNY 
PLEEPTTEPPVNLTYSANSPVQR 



SErYVAlvU^TU3^KMTDVESGV ANFASSARAGRRNAt. - pDIQSSAA - 
TDGTSDLPLKLBALSVKEDAKEKDEKTTQDQLEKPONEBK 



UAWADAWVRAWDLWMDFPCX.WLGLLLPLVA ALDFNYHRQBGMEA 
FLKTVAQNYSSVTHLHSIGKSVKGRNLWVLWGRFPKEHRIGIP 
EFKYVAMMHGDETVGRELLLHLIDYI^VTSDGKDPEITIILXNSTR 
XHIMPSMNPDGPEAVKKPDCYYSIGRENYWQYDLNRNPPDAFEY 
^SRQPBTVAVMKWhKTBTFVhSRNmCSOALVASYPFDmVQA 
TGALYSRSLTPDDDVTQYLAHT^ASRNPNMKKGDECKNICMNFPN 
GVTKrGYSWYPLQGGMQDYNYIWAQCFElTI,El^CCKyPREEKLP 
SFt^NNmCASIiI EYI KQ VHLGVKGQVFDQNGMPLPNVI VEVQDR^ 


5519






HICPyRTNKrGEyYLIiLEiKJS Y 1 INVTVPGHDPHl^f KVI IPEKS 

QNFSALKKDILLPFQGQt.DSlPVSN&SCPMIPLyRNLPDHSA^T 
KPSr.FLFLVSLLHIFFK 


5520 


37 


477 


IKSiCLNQQVEVQESEWRLTEAKGPTMGKESGWDSGRAAVAAWG 
GWAVGTVLVALSAMGFTSVGIAASSIAAKMMSTAAIANGGGVA 
AGSLVAILQSVGAAGIiSVTSKVIGGFAGTALGAWLGSPPSS 




117 


943 


PTEGRQKVIiKTFTVPRSALAMTKTSTCIYHPLVl^SWyTFLNYYI 
SQEGKDEVKPKILANGARWKYMTLIiMr.r.T.nT'TPWM/n'/^T T\T\T>r w 

RTKGGKDIKPLTAFRDIJ.FTTLAFPVSTFVFLAPWILFI,YNRDL 
iyPKVLDTVr?VWI.NHAMHTPIPPITLAEVVLRPHSypSiaaV3I. 
TLIAAASIAYISRILWLYFETGTWVYPVFAKLStiLGIAAFFSLS 
YVFrASlYLW3EKI^HWKWVSVQILQRWRLESVGICFQMPDWKS 
PAKHQLVKUIR 


5521 
5522


546 


9X1 


KI JjNMQKSCEENEGKPQNM pkaeedrpledvp'qeaegi^pqpsee " 

GVSQEAEGNPRGGPNQPGOGFKEDTPVRHLDPEBMIRGVDELER 

LREEIRRVRNKFVMMHWKQRHSRSRPYPVCFRP 


5523 


1224 


SJ7 


GSRPLGQRSREKMWVFGYGSLIWKVDFPYQDKI.VQYfTkYSRRF 
1 uwKijf V i'UftJ:'ORvVTljVEDPAGCVWGVAYRliPVGKEEB\^ 
AYLDFREKGGVHrTrVIFYPKDPTOCPF^VLLyXGTCDWPDYIiG 
PAPLEDXAEQ t FNAAG PSGRNTEYliFE tiANSIRNLVPEEADEHI. 
PALEKLVKERLEGKQKMJCI 




3 


1280 


SKGKKRMGSSMSAATARRPVFDDKEDVNFDHFQILRAIGKGSFG 

KVCIVQKRDTE;<MYAMKYt4NKQQClERDEVRKVPRELEII^EIE 

HVFLVNLWYSFQDEEDMPtWVDLLLGGOLRYHLQQNVQPSEDTV 

RLYICJSMALALDYLRGQHI IHRDVKPDJJILLI>ERGHAHLTDFNI 

ATIIXDGERATALSGTKPYMAPEIFHSFVKGGTCYSFBVDWWSV 

GVMAYELLRGWRPYDIHSSNAVESLVQIiFSTVSVQYVPTWSKEM 

VM'LRKLhTmPEHRhSSU^DVQAA^AhAOVhVJDHLSBKRVEPG 

FVPNKGRLHCDPTFELEEMIl^SRPLHiCKKKRIAKNKSRDWSRD ^ 

SSQSENDYLQDCIiDAIQQDPVIFNREKLKRSQDLPREPLPAPES 

RDAAEPVEDEAERSALPMCGPICPSAGSG 


5524
5525


es 


2318 


RERERDHRPGESSQGQSGAGGCFPSPTMEIiRCGGLLFSSRF^S^ 
KItAHVEKVESLSSDGEGVGGGASALTSGIASSPDYEPNVWTRPD 
CAETEFENGNRSWFYFSVEGGMPGKLIKINIMNMNKQSK1.YSQS 
MAPFVRTLPTRPR WER I RDR PTFEMTETOFl/LS PV3P mran-oriTi. 

TTFFAFCYPPSYSDCQELLNQLDQRPPENHPTHSSPLDTIYYHR 
BLLCYSLDGLRVDLr.TXTSCHGI.REDRBPRr.EQi:.PPDTSTPRPF 
RFAGKRIPFLSSRVHPGETPSSFVFNGFUiFILRPDDPRAQTia 
RLFVFKLIPMLNPDGWRGHYRTDSRGVNLNRQYLKPDAVIiHPA 
IYGAKAVLI,YHHVHSRI,NSQSSSEHQPSSCLPPDA?VSDI,EKAM 
NLQNEAQCGHSADRHNAEAWKQTEPAEQKIiNSVWIMPQQSAGIiE 
ESAPDTIPPKESGVAYY^fDUlGHASKRGCFMYGNSFSDESTQVE 
NMLYPKLISLNSAHFDFQGCNFSEKNMYARDRRDGQSKEGSGRV 
AIYKASGIIHSYTLEGNYNTGRSVNSIPAACHDNGRASPPPPPA 
FPSRyTVELPEQVGRAMAIAAU>5?^CWPWPRIVLSEHSSI*TNL 
RAW^a.KHVHNSRGI*SSTLNVGVNKKRGI*RTPPKSHNGLPVSCSB 
NTLSRARSFSTGTSAGGSSSSQQNSPQMKNSPSFPFHGSRPAGL 
PGI^SSTQKVTHRVIiGPVRGKPVWEPr.QHV¥X3CLGHCWGK 


5526 


105 


834 


SNTLDFERHLvjMUQQISDQTQLVINKI.PEKVAKHVTLVRESGr" 

LTYEEFLGRVAELNDVTAKVASGQEKHLLFEVQPGSDSSAFWKV 

WRWCTKINKSSGIVEASRIMNLYQFIOLYKDITSQAAGVLAQ 

SSTSEEPDENSSSVTSCQASLWMGRVKQLTDEEECCrCMDGRAD 

LXLPCAHSFCQKCIDKWSDRHRNCPICRLQMTOaNESWWSDAP 

rEDDMAKYILNMADEAGQPHRP 




3 


853


RRPCNPVRAAKRTGAAARAPRGLEVTMI.RVAWRTl.SI,iRTRAVT 
3VLVPGLPGGGSAKPPFNQWGLQPRSLULQAARGYVVRKPAQSR 
tiDDDPPPSTI^KDYQm^IEKVDDWKIiLItSLEMAiaKKBMLKI 
KQEQFMKKIVANPEDTRSLEARI lALSVKlRSYEEHLEKIlRKDK 
\HKRyr«LMSlDQRKKMLKNliRWTNYDVPEKICWaiiGIEYTPpPL 
naiRAHRRFVTKKAl,ciRVFQETQia.KKRRRALKAAAAAQKQAi 
RRN?DS PAKAI PKTLKDSQ " ^ 


5527


3225 


565 


LLR :<y LLHQW PLLLRHQPNRTCI SFSATMKIiKDTKSRPKQSS CG 
KFQTKGI KWG KWKE VKIDPNMPADGQMDDLVCPEELTDYQLVS 
PAKNPSSIiFSKEAPKRKAQAVSEEEBEEEGKSSSPKKKIKLKKS 
KNVATEaTSTQKEPBVKDPELEAQGDD^^VCDDPEAQEMTSENLV 
QTAPKKKKNKGKKGLEPSQSTAAKVPKKAKTWI PEVHDQKADVS 
AMKDLFVPRP VJUiALS PLGFS APTP iQALTIiAPAIRDKIiDlIiGA 
AETGSGKTIiAFAIPMIHAVIiQWQKRNAAPPPSNTEAPPGETRTE 
AGAETRS PGKAEAESDALPDDTVIES EALPSDIAAEARAKTCGT 
VSDQALLPGDDDAGEGPS SLIRE KPVPKQNEWEEENUiKEQTGN 
LKQEIiDDKSATCKAYPKRPLLGLVLTPTRELAVQVKQHIDAVAR 
FTGIKTAII*VGGMST0KQQRMLNRRPEIWATPGRI,WE1,IKEKH 
YHLRNLRQLRCLWDEADRMVEKGHPAELSQUiEWLNDSQVNPK 
RQTLVFSATIjTIiVHQAPAHIIiHKKHTKKMDKTAKLDIiLMQKIGM 
RGKPKVIDLTRNEATVETLTETKIHCBTDEKDFYLyypjji^^QYPG 
RSLVFANS ISCI KRLSGLLKVLniMPliTI.HAC3iaHQKQRtiRNI.BQ 
FARLEDCVLLATDVAARGLDIPKVQHVIHYQVPRTSEIYVHRSG 
RTARATNEGLSLMLIGPEDVINFKKIYKTLKKDEDIPLFPVOrK 
YMDWKERIRIARQIEKSEYRNFQACLHNSWIEQAAAALEIELE 
EDMYKGGKADQQEEURRQKQMKVLKKELRl-OJtiSQPLFTESQKTK 
yPTOSGKPPHjVSAPSKSESALSCLSKQKKKKTKKPKEPQPEQP 
QPSTSAN 


5528 


3 


835 


GPFLSACRMWGACKVKVHDSLATIS XTLRRYLRLGATMAKS KFE ' 
YVRDPEADDTCLAHCWVWRLDGRNFHRFAEKHUFtAKPNDSRAL 
QLM^KCAQTVMEELEDIVIAYGQSDEYSFVFKRKTNWFKRRASK 
FMTHVASQFASSYVFYWRDYFEDQPLLYPPGFDGRVWYPSNQT 
LlOJYLSWRQADCHrNNLYKTVFWALrQQSOLTPVQAQGRLQGTL 
AADKiNEILFSEFNlNYNWEPPMYRKGTVLXWQKVDBVMTKEIKL 
PTEMEGKKMAVTRTRTKPCKPSHLPRAPCLRWL { 


5529


48 


640 


TFRLVSAHLKTRKLIWrPEAAERRWRDWDSRQGWLSVKMQRVSGL " ' 

LSWTLSRVLWLSGLSBPGAARQPRIMEEKALEVYDLIRTIRDPE 

KPNTLEELEWSESCVEVQEINEEEYLVIIRFTPTVPHCSLATL 

IGLCLRVKLQRCLPPKHKLEIYISBGTHSTEEDINKQINDKERV 

AAAMENPMLREIVEQCVLEPD 


5530


454


2606 


AQIVHAISYCHKLHVGHRDLfCPENWFFEKQGLVKLTDFGFSNK 
FOPGKKLrrSCGSLAYSAPEILLGDEYDAPAVDIWSLGVILFML 
VCGQPPFQEANDSETLTMIMDCKYTVPSHVSKECKDLITRMLQR 
DPKRRASLEEIENHPWLQGVDPSPATKYNIPLVSYKNLSEEEHN 
SIIQRMVLGDIADRDAIVEALETNRYKHITATYFLLAERILREK 
Wiijuixy iKoAtoi-'&KXKAttr KwSWPTKZDVPQDLEDDLTATPLSH 
ATVPQS PARAADSVLNGHRSKGLC3>SAKKDDLPELAGPALSTVP 
PASLKPTASGRKCLFRVEEDEEEDEEDKKPMSLSTQWLRRKPS 
VTNKLTSRKSAPVLNQIFEEGESDDEFDMDEWLPPKLSRLKPItn: 
ASPGTVHKRYHRRKSQGRGSSCSSSETSDDDSESRRRLDKDSGF 
TYSWHRRDSSEGPPGSEGDGGGQSKPSNASGGVDKASPSENNAG 
GGSPSSGSGGNPTNTSGTTRRCAGPSKSMQriASRSAGELVESLK 
LMSLGLGSQLHGSTKYI IDPQNGLS FSS VKVQEKSTWKMCI SST 
GNftGQVPAVGGIKFFSDHMADTTTELERIKSKNLKNNVLQLPIiC 
EiCTISVNIQRNPKEGLLCASSPASCCHVI 


5531 


24 


515 


GSOPRAPRPRDSMERPEPELIRQSWRAVSRSpLEHGrVLFARLF 
ALEPDLLPLFQyNCRQFSSPEncr.SSPEFLDHIRKVMLVIDAAV 
TNVEDLSSLEEYXASLGRKHRAVGVKLSSFSTVGESLLYMLEKC 
LGPAPTPATRAAWSQLYGAWQAMSRGWIXSB 


5532 


3395 


1402 


SDWKVVGKRKMIIEDETEFCGEELLHSVLQCKSVFDVLDGEEMR " 

RARTRAWPYEKIRGVPFLNRAAMKMANMOPVFDRMFTNPRDSYG 

KPLVKDRBABLLYFADVCAGPGGFSEYVLWRKKlfHAKGFGMTLK: 

GPNDFKLEDFYSASSELFEPYYGEGGIDGDGDITRPEWISAFRN 

FVLDNTDRKGVHFLMADGGFSVEGQENLQEILSKQLLLCQFLMA 

LSIVRTGGHFXCKTFDLFTPFSVGi:,VYLLYCGFERVCa:,FKPlTS 

RPANSERYWCKGLKVGIDDVRDYLFAVNI KLNQLRNTDSDVlfj 






SEPRQAEIRKECLRLWGIPDQARVAPSSSDPJCSKPPELIQGTEZ 
DIPSYKPTLI^TSKTLEKIRPVFDyRCMVSGSEQKFLIGLGKSQI 
YTWDGRQSDRWIKt>DLKTEI.PRDTLLSVEZVHELKGEGKAQRkl 
SAlHIIiDVLVUTGTDVREQHFNQRIQIAEKFVKAVSKPSRPDMN 
PIRVKEVYRbEEMEKIFVRLBMKIIKGSSGTPKLSYTGRDDRHP 
VPMGLYIVRTVNEPWTMGPSKSPKKKFFYNKKTKDSTFDLPADS 
^^ICYYGRLPWEWODGIRVHDSQKPQDQDiaSKEDVLSFIQ 


5533 
"5534 


94 


789 MKERRAPQPWARCKLVLVGDVQCGKTAMLQVIAKDCYPETYVP 

TVFENYTACLETEEQRVELSEiWDTSGSPYYOWVRPLCYSDSDAV 

IXCFDISRPETVDSALKKWRTEILDYCPSTRVI,I*IGCKTDLRTD 

LSTLMELSHQKQAPISYEQGCAIAKQLGPEIYLEGSAFTSEKSI 

HS1FRTASMLCLNKPSPLPQKSPVRSI^KRLLHLPSRSEI.ISPT 
PKKEKAKXCSIM 




3 


605 ^^VRGRARAANPGRVGAMDGLRQRVEHFLEQRNLVTEVLGAIiEAK 
TGVEKRYtiAAGAVTIJC^SLYliriPGYGASLLajLIGFVYPAYASIK 
AIESPSKODDTVWLTYWWYALFGIiAEFFSDLLIbSWFPFYYVGK 
CAFIiI^FCMAPRPWNGALMLYQRWRPLFLRHHGAVDRIMNDLSG 
RALDAAAGITRNVKPSQTPQPKDK 


5535


1029 


332 *^^«DSEARLCSX.VELSDTQDETQKSDSENEDLKIDCLQES"QEIj 
NLQKLKNrSERILTEAKQKMRELTVNIKMKEDLlKEl,IKTGKDAK 
SVSKQYTLKVTKLEHDAEQAKVELTETQKQLQEI.EKKDLSDVAM 
KVKI^QKEFRKKVDAAKI^VQVTJQKKQQDSKKIASLSIQWEKRAN 
ELEQSVDHMKYQKIQIjQRiaUQEEMEKRKQLDAVIKRDQQKIKVI 
LSYIPAKYNMKC 


5536 




282 AAATAASLSPRGCRLRTPSSDVSPSRA^PPSAAPLPTSRAQMSP 
SGRLCLLTtVGLILPTRGQTLKDTTSSSSADATlMDIQVPTRAP 
I DAVYTELQPTSPTPTWPADETPQPQTQTQQLBGTDGPLVTBPET s 
HKSTKAAHPTDDTrTLSERPSPSTDVQTDPQTl,KPSGPHEDDPF 
Fyr>EHTLRKRGr,LVAAVLFIT(3lriI,T5GKCR0LSRl,CRNHC!R 


5537 
5538


3 


RftRVSSPQlARVFRSGRPRRLRVU^IKRTSVALRI.AGTGRPVAKt' 
PGHPGSWEMGLLTFRDVAVEPSLBEWEHLBPAQKNLYQDVMLEN 
YRNLVSUSLWSKPDLITFLEQRKEPWNVKSEBTVAIQPDVFSH 
YIJKDIiTEHCTBASFQKVISRimGSCTLENlillJlKRWKl^ 
[ HNGCYDEKTFKYDQFDESSVBSLFHQQIIiSSCAKSYNPDQVRKV 

EKNYHCNNSEKTliNQSSSPKNHQENYFI,EKQYKCKEF3EVPLQ$ 
MHGQEKQEQSYKCNKCVBVCTQSLKHIQHQTIHIRENSYSYWKY 
DKDLSQSSNIiRKQ 1 1 HNEBKP YKCEKCGDSLNHSLHLTQHQI 1 P 
TEEKPyKWKEC2KVPNrJ^CSLYXTKQQQIimJENI.YKCKACSKS 
FTRSSNLIVHQRIHTGEKPYKCKECGKAFRCSSYLTKHKRIHTG 
EKPYKCKECGKAPNRSSCXTQHQTTHTGEKLYKCKVCSKSYARS 
SNLIMHQRVHTGEKPYKCKECGKVFSRSSCbTQHRKIHTCBNLY 
KCKVCAKPFXCFSNlilVHERIHTGEKPYKCKEOGKAFPYSSHLI 
RHHRIHTGEKPYKCKACSKSFSDSSGLTVHRRTHTtSEKPYTCKE 
CXSKAFSYSSDVIQHRRIHTGQRPYKCEECXSKAFNYRSYLTTHQR 
SHTGERPYKCEECGKAFNSRSYLTTHRRRHTGERPYKCDECGKA 
FSYRSYXTTHRRSHSGERPyKCEBCGKAFMSRSYLlAHQRSirrR 
EKL 


5539


$2G 


3.&1 HSMMMKIPWGSIPVLMLLLLI^LIDISQAQI^CTGPPAIPGIPG 
IPGTPGPDGQPGTPGIKGBKGLPGLAGDHGEFGEKGDPGrPGNP 
GKVGPKGPMGPKGGPGAPGAPGPKGESGDYKATQKIAFSATRTI 
l^^PLRRDOTIRFDHVIT^IMNmTYEPRSGICFTCKVPGLYYFTYHA 
SSRGNLCVWLMRGRERAQKWTFCDYAYNTFQVTTGGMVLKLEQ 
GENVFLQATDKNSLtiGMEGANS I FSGFLLFPDMEA 




38 


1258 MRGPSGAAAPGCALPRGQALBGPRSCRRPQPMARRYDEIjPHyPG " 
IVDGPAAriASFPETVPAVPGPYGPHRPPQPLPPGLDSDGI.KREK 
j DSIYGHPLFPLLALVFEKCEriATCSPRDGAGAGLGTPPGGDVCS 
j SDSFNEDIAAFAKQVRSERPIiFSSNPELDNLVIQAlQVLRFHU^ 



5540 



148 



5541



1440 



148 



5543



2405 



5544



1895 



S14 



BLEKVii^LCOWyCHKyriCLKGK MPXDJbVIBDRDGGCREPFEDY 

PASCPSLPDQNNMWIRDHEDSGSVHLGTPGPSSGGLASQSGDStS 

SDgGDGLDTSVASPSSGGEDEDLDQER3iRNKKRGIFPKVATNIM 

RAWLFQHI^SHPYPSEEQKKQLAQDTCLTILQVWNWFINARRRIV 

QPMIDQSNRTGQGAAFSPEGQPIGGYTETQPHVAVRPPGSVGMS 
IJJLEQEWHYL 

PPLOA^aAiiVHAliHPHPARRLPL'rrAGVGGRAPDI.X^prPWRQHRG 



PSGAAAPGCALPRGQALEGPRSCRRPQPMARRYDEUPHYPGIVD 
GPAALASFPETVPAVPGPYGPHRPPQPLPPGLDSDGLKHEKDEI 
YGHPLFPLlALVFEKCEliATCSPRBGAGAGLGTPPGGDVCSSDS 
FNEDNTAFAKQVRSSRPLFSSNPELDNLMIQAlQVt.RFHI*LELE 
KGKMPIDhVIEDRDGGCRBDFBDYPASCPSLPDQmimRDHEJ) 
SGSVHLGTPGPSSGGLASQSGDNSSDQGVGUDTSVASPSSGGED 
EDLDQEPRRNKKRGrFPKVATNrMRAWLFQHLSKPYPSEEQKKQ 
LAQDTGLTILQVNNWPIKARRRIVQPMIDQSNRTGQGAAFSPEG 
QPIGGYTSTEPHVAFRAPASVGD EFGTRKEgWHYI* 
PPLGAGAGVHARSPHP ARRLPLTTAGV ' UGRAPDtiLPTPWRQH 
PSGAAAPGCALPRGQALEGPRSCRRPQPMARRYDELPHYPGI^ 
GPAAIASFPETVPAVPGPYGPHRPPQPLPPGLDSDGI.KREKDEr 
YGHPLFPLIAUVFEKCELATCSPRDGAGAGLGTPPGGDVCSSDS 
FNEDNTAFAKQVRSERPLFSSNPEbDNIiMIOAlOVLRFHrjCELE 
KGKMPIDLVIEDRDGGCREDFEDYPASCPSLPDQNNXKIRDHED 
SGSVHLGTPGPSSGGIiASQSGDNSSDQGVGliDTSVASoSSGGED 
EOI,DQEPRRNKKRGI FPKVAI^NIMRAWLFQHISHPYPSEEQKKQ 
liAQDTGIiTILQVNNWFII^aKRRIVQPMIDQSNRTGQGAABSPEG 
QPIGGYTSTHPHVAFRAPASVQDEFGTRKEE WHYL 
PPLGAGAGVHARSPHPARRLPLTTAG VGGRAPDLLPTPWRQHRQ 
PSGAAAPGCALPRGQALEGPRSCRRPQPMARRYDELPHYPGIVD 
GPAALASFPBTVPAVPGPYGPHRPPQPLPPGIiDSDGLKREKDEI 
YGHPLFPLIALVFEKCELATCSPRDGAGAGLGXPPGGDVCSSDS 
FNEDNTAFAKQVRSERPI.FSSNPELDNI.MIQAXQVI.RFHLLELE 
KGKMPIDLVXEDRDGGCREDFBDYPASCPSLPDQNNIWIRDHBD 
SGSVHLGTPGPSSGGLASQSGDNSSDQGVGLDTSVASPSSGGED 
EDLDQEPRRWKKRGI FPKVATNIMRAWLFQHI^HPYPSEEQKKQ 
LAQDTGLT1LQVNNWPINARRRIVQPM1DQSNRTX3QGAAP3PEG 
QPIGGYTBTEPHVAFRAPASVGDEFG rRKEEWHYL 
RWVREQPWPLKTSEAVKTP ALRPFPGPRGVtiPFPKPDWGKSPAg 
KRPFSDSGAFWSPERRPGVIiEAPRRRPVPASFRAVPPKPTRVHG 
SSASRDRVLARTMIVADSECRAELKDYLRPAPGGVGDSGPGEEQ 
KBSRARRGPRGPSAPXPVEEVLREGAESLEQHlxGLEALMSSGRV 
DNLAWMGLHPDYFTSFWRLHYLLLHTDGPLASSWRHYIAXMAA 
ARHQCSYLVGSHMAEFLQTGGDPEWLLGLHRAPEKLRKLSBINK 
LlAHRPWLITKEHIOAXiLKTGEHTWSLAELlQALVLLTHCHSI^ 
SFVFGCGILPEGDADGSPAPQAPTPPSEQSSPPSRDPLNNSGGF 
ESARDVEALMERMQQLQESLLRDEGTSQEEMESRFELiEKSESLL 
VTPSADXbEPSPHPDMLCFVEDPTFGYEDPTRRGAQAPPTFRAQ 
DYTWEDHGYSLIQRLYPEGGQLLDEKFQAAYSLTYNTIAMHSGV 
DTSVLRRAIWNYIKCVFGIRYDDYDYGEVNQLLERNIJCVYTKTV 

ACYPEKTTRRMyKLFWRHFRHSEmiVNIJ:jj:,BARMQAALI.YAL 
RAITRYMT 



LGGI.IXSRQRLLLKMGAGHLGAPMERHG RASAT5VSSAGEQAAGD " 

PEGRRQEPLRRRASSASVPAVGASAEGTRRDRLGSYSGPTSVSR 

QRVESLRIOCRPLFPWFGLDIGGTLVKLVYPEPKDITAEEEEEEV 

ESLKSIRKyLTSNVAYGSTGIRDVHI^LKDLTLCGRKOTCHFIR 

PPTHDMPAFIQMGRDKNPSSLHTVPCATGGOAYKFEQDFLTIGD 

X^QLCKLDBI.DCIxIKGXIiYIDSVGFNGRSQCYYFBNPADSEKCOK 

LPFDLKNPYPLLLVNIGSGVSIIAVYSKDMYKRVTGTSrx^GTF 

FGLCCLLTGCTTFEEALEMASRGDSTKVPKLVRDIYGGDYBRFG 

LPGWAVASSFGNMMSKEKRBAVSKEDIARATLITITNNIGSIAR 

MCAlJVEitfIKQWFVGKn:,RXWTrAMRLIAYAri3yWSKG3LKAL» 

SEHEGYFGAVGALLELLKIP ^ 




5545


802 


131 


GAMWSAGRGGAAWPVLLGLLLALLVPGGGAAKTGAELVfCGSVL 

KLLNTHHRVRUIS IIDI KYGSGSGQQS VTGVEASDDANSY WRIRG 

GSEGGCPRGSPVRCGQAVIUUTHVLTGKNLHTHHFPSPLSNNQEV 

SAFGEDGEGDDLDL.WTVKCSGCHWEREAAVRFOHVGTSVFLSVT 

GEQYGSPIRGQHEVHGMPSANTHNTWKAMBGIFIKPSVBPSAGH 
DEI* 




5546 


1592 


146 


fvprgghssmgqsgrsrhqkraraqaqlrnleXyaanphsfvft 

RGCTGRNIRQLSLDVRRVMEPtiTASRLQVRKKMSLKDCVAVAGP 
LGVTHFLILSKTETNVYFKIiMRLPGGPTLTFQVKKYSLVRDWS 
SLRRHRMHEQQFAHPPLLVLNSFGPHGMHVKIiMATMFQNliFPSI 
mrHKVNLNTIKRCt^LIDYWPDSQELDFRHYSlKVVPVGASRGMK 
KliLQEKFPNMSRLQDlSELIATGAGLSESEAEPDGDHbriTELPQ 
AVAGRGNMRAQQSAVRLTEIGPRKTLQLIKVQEGVGEGKVMFHS 
FVSKTEEELQAILE2VKEKKIiRI*KAQRQAQ0AQWQRKQEQREAH 
RKKSLEGMKKARVGGS0EEASGIPSRTASLELGEDDDEQBDDDI 
EYFCQAVGEAPSEDLFPEAKQKRLAKSPGRKRKRWEMDRGRGRIi 
CDQKFPKTKDKSQGAQARRGPRGASRDGGRGRGRGRPGKRVA 




5547


1592 


146 


FVPRGGHSEMGQSGRSRHQKRARAQAQLRNLEAYAAKPttSFVFT 
RGCTGRKIRQLSLDVRRVMEPLTASRLQVRKKNSLKDCVAVAGP 
LGVTHFI.ILSKTETNVYFKLMRI,PGGPTLTFQVKICYSI,VRDVVS 
SLRRHRMHEQQFAHPPLI.VLNSFGPHGMHVKIiMATMFQNLFPSI 
imiK^IOTIKRClJ:.IDYNPDSQELI>FRIfYSlKVVPVGASRGMK 
KIiI*QEKFPNMSRLQDXSBLLATGAGX*SBSEAEPDGDHNITEt»PQ 
AVAGRGNMRAQQSAVRI,TEXGPRMTI/QX,IKVOEGVGEGKVMPHS 
FVSKTEEELQAILEAKEKKLRIiKRQRQAQQAQNVQRKQEQREAH 
RKKSLEGMKKARVGGSDEEASGIPSRTASl.KIiGEDDnEQEDDDI 
EYFCQAVGEAPSEDLFPEAKQKRLAKSPGRKRKRWEMDRGRGRI* 
CDQKFPKTKDKSQGAQARRGPRGASRDGGRGRGRGRPGKRVA 


5548 


1 


2153 


IXiTGPPETIAFTFPRSTMEPLCPLLLVOFSIiPIiARAI.RGNEfTA 
DSNETTTTSGP PDPGASQPIiLAWIiLIiPIiliLLI»I.VribrAAYFPRP 
RKQRKAVVSTSDKKMPNGILEEQEOQRVMLIiSRSPSGPKKYFPl 
PVEHLEEEIRIRSADDCKQPREEFKSr.PSGHIQGTFEI«ANlCEEN 
REKWRyPWILPiroHSRVlI.SQLDGIPCSDyj[HASYXDGyKEKNK 
FIAAQGPKQETVMDFWRMVWEQKSATIVMLTNLKERKEEKCHQY 
WPDQGCWTYGNIRVCVEDCVVBVDYTIRKFCXQPQriPDGCKAPR 
LVSQLHFTSWPDPGVPFTPIGMLKPLKKVKT1.NPVXIAGPIVVHC 
SAGVGRTGTFJVIDAMMAMMHAEQiCVDVFBFVSRIRNQRPQMVQ 
TDNKJYTPlYQALI,EYYLYGDTEIiDVSSt*EKHI,QTMHQTTTHFDK 
IGLEEEFRKLTNVRIMKENMRTGNLPANMKKARVIQIIPYDFNR 
VILSMKRGQEYTDYINASFIDGYRQKDYFIATQGPLAHTVEDFW 
RMIWEWKSHTIVMLTEVQEREQDKCYQYWPTEGSVTHOEITIEI 
KNDTLSEAISIRBPLVTLNOPOARQBEQVRWRQPHPHGWPErG 
IPAEGKGMIDLIAAVQKQQQQTGNHPITVHCSAGAGRTGTFIAL 
SNILERVKAEGI/LDVFQAVKSLRLQRPHMVQTLEQYBFCYKWQ 
DFIDIPSDYANPK 


5549


915 


256 


FEATGGKREAFKMAGTARHDREMAIQAKKKLTTATDPIERLRLQ " 

CLARGSAGXKGLGRVFRIMDDDNNRTLDFKEFMKGLNDYAWME 

KEEVEELFQRFDKDGNGTIDPNEFLLTLRPPMSRARKEVIMQAP 

RKLDKTGDGVITIEDLREVYNAKHHPKYQNGEWSEEQVFRKPIJ) 

N^JSPYDKI)GLWPEEFMNYyAGVSASII>TDVyFII^5^IRlAWKti 


5550
5551 


2364 

211 


1210 
1700 


RKRKVFI.KMRRIiNRKKTI>SLVKEI4DAPPKVPESYVETSASGGTV 
SLIAFTTMALLTIMEFSVYQDTWMKYEYEVDKDFSSKLRINIDI 
TVAMKCQYVGADVLDrjAETMVASADGLVYEPTVPDT^PQQKEWQ 
RMLQLIQSRlXJEEHSLQDVXFKSAFKSTSTAliPPREDDSSQSPN 
ACRIHGHLYVNKVAGMPHITVGKAIPHPRGHAHLAALVWHESYN 
FSHRIDHLSFGELVPAIINPLDGTEKIAIDHNQMFQYFITWPT 
KbHTYKlSADTHQFSVTERERIINHAAGSHGVSGIFMKYDLSSIi 
KVTVTEEHMPFWQFPVRLCGIVGGIFSTTGMLHGIGKFIVEIIC 
CRFRIiGSYKPVNSVPFEDGHTDNHLPbliENWTH 








MQRDHTMDYKESCPSVSIPSSDEHREKJaaiPTVYICVl>VSVGRS|:' " 



5552



2748 



WFVFRRYAfc:trUKI.YNTLKKQ KPAMALKIgAKUiFGDNFDPDFXK 

QRRAGI,WEFIQNL^/RYPELYNHPDVRAFtQMDSPKHOSDr»SfiPE 

DERSSQKLHSTSQNINIiGPSGNPHAKPTDFDFLKVIGKGSFGKV 

LlAKRKLIXSK^'YAVKVl^KKIVLNRKEQKHIMRBRWLLKmnOT 

PFLVGLHYSFQTTEKLYFVLDPVNGGEIiFPHEiQRERSFPEHRAR 

FYAAEIASAI^YlJiSIKIVYRDLKPEKILLDSVGHWLTDttJLC 

KEGIAISDTTTTPCGTPEYLAPEVIRKQPYDNTVDWWCU3AVLY 

EMLYGLPPPYCRDVAEMXBN1LJJKPLSI.RPGVSLTAWS1LBELL 

EKDRQNRbGAKEDFLEIQNHPFFESLSWADLVQKKIPPPFNPNV 

AGPDDIRNFDTAFTEETVPYSVCVSSDYSIVNASVLEADDAFVG 
FSYAPPSEDLFIi 



LGKH^^W^^tiKiOiKKHKAEWRSSYEDYADK PLEKPLKLVLKVGG 
SEVTELSGSGHDSSYYDDRSDHERERHKEKKKKKKKKSEKEKHL 
DDEERRKRKEEKKRKREREHCDTEGEADDFDPGKKVEVEPPPBR 
PVRACRTQPAENESTPIQQLLEHFLRQLQRIOPHGPPAPPVTDA 
I APGYSMI I KMPMDFGTMKDKI VANEVKSVTEFKADPKLMCDNA 
MTYNRPDTVYYKIAKKILHAGPKMMSKQAALLGNEI)TAVBEPVP 
EWPVQVETAKKSKKPSREVISCMFEPEGNACSLTDSTAEEHYL 
ALVEHAADEARDRINRPLPGGKMGYtiKRNGDGSliI*YSVVNTAEP 
DADEEETHPVDLSSLSSKIiPGFTTLGFKDERRNKVTPl^SATT 
ALSMQMNSVFGDLKSDEMELI.YSAYGDETGVQCAX*SLQEFVK1>A 
GSySKKWp0l.LDQirGGDHSRTIiFQLKQRRNVPMKPPDEAKVG 
DTLGDSSSSVLEFMSMKSYPDVSVDISMLSSLGKViOCELDPDDS 
HLWLDETTKLLQDLHEAQAERGGSRPSSNLSSLSNASERDOHHL 
G3PSRLSVGEQPDVTHDPYEFLQSPEPAASAKT 



i.GREAVYLViiRMDGPVAEHAKQEPFHW'rPJbLBSWAl*SQVAGMP 
VFIiKCENVQPSGSFKIRGIGHFCQEMAKKGCRHLVCSSGGNAGI 
AAAYAARKLGZ PAXl VZ^PESTSLQ WQRLQGEGAEVQLTCKVWD 
EANI^RAQEIiAKRDGWBNVPPPDHPLTWKGHASWQELKAVLRTP 
PGALVLAVGGGGLLAG WAGLLEVGWOHVPI lAMElHGAHCFNA 
AITAGKLVTLPPITSVAKSLGAKTVAARALECMQVCKIHSEWE 
DTEAVSAVQQLLDDERMIiVEPACGAAlAAIYSGLLRRLQAEGCI* 
PPSLTSWVI VCGGNNlNS RELQALKTHtiGQV 



5555



212 



5556



5835 



1425



3346 



CSURTGGRGSDRPAEWVCLTCKLSGAETRGLLCPALRTWIMKVL 

GRSFFWVLFPVXPWAVQAVEHEEVAQRVIKLHRGRGVAAMQSRQ 

WVRDSCRKLSGLLRQKNAVlJJfKLKTAIGAVEKDVGLSDEEJXFXJ 

VHTPEIFQKFUaESENSVFQAVYGLQRALQGDYlCDVVNMKESSR 

QRLEALREAAIKEETEYMELLAAEKHOVEAJbKNHQHQNQSLSMI, 

DEILEDVRKAADRLEEBIEEHAFDDNKSVKGVKFEAVLRVEEEE 

ANSKQNrTKREVEDDLQLSMIilDSQWNQYILTKPRDSTIPRAlJH 

HFlKpiVTIGMI,SLPa5WI,CTAIGLPTMFGYIICGVLtiGPSGI^ 

SIKSIVQVETLGEFGVFFTI,Fl.VGLEFSPEKLRKVWKXSIOGPC 

YMTLLMlAFGLI.WG:^aiUllKi>TQSVPISTCliSLSSTPLVSRFLM 

GSARGDKEGDlDYSTVLLGMI*VTQDVQI.G3UFKAVMPrLIQAGAS 

ASSSIWBVLRlLVLIGQlLFSrAAVPLLCLVIKKYLIGPyyRK 

IJ3MESKGWKEILILGISAFIFI«MLTVTEr*IJ5VSMEliGC:FIA^^ 

VSSQGPWTEEIATSIEPIRDFLAIVFFASiarjJVrPrFVAYEIi 

TVLVFLTLSWVMKFLLAALVLSr^inPRSSQYIKWrvSAGlAQV 

SSFSFVLGSRARRAGVISREVYLtilliSVmiSLLUVPVLWRAAt 
TRCVPRPERRSSL 



LSl^TREU-PAPPRCEAASQGRVGWRAP AAAEEAVRSVWNRTRPR " 

GTMAPQNLSTFCLhLLYLIGAVlAaRDFYKlLGVPRSASlKDiK 

KAYRKLALQLHPDRNPDDPQAQSKFQDLGAAYEVIiSDSEKRKQY 

DTYGEEGLKDGHQSSHGDIFSHFFGDFGFMEX5GTPRQQDRNIPR 

GSDIIVDLEVTLEEVYAGNFVEWRNKPVARQAPGKRKCNCRQE 

MRTTQLGPGRFQMTQEVGCDECPWVKLVNEERTLEVEIEPGVRD 

GMEYPFIGEGEPHVDGSPGDLRPRIKWKHPIPERRGDDI^YTNV 

TlSLVBSLVGFEMDITHLTXSKKmiSBDKrrRPGAKLmKGEGL 

PNFDNNNlKGSLIITFDVDFPKEQLTEEAREGIKQtLKQQSVQK 
VYUGhQGY 

RTRGMS KNCVPfaEFEE YliLRMFQGtFYLIK^KI TKDNNAHTVkS 1^ 








i.EELDESYiJiKP-rDFIU?LPVSVHLRRIBSy§5Fpwmm:Fir^ 

YTFHQPTHEGYFSCLDIWTLFLDYLTSKIKSRLGDKEAVI.NR^E 

DALVLLLTEVLNRIQFRyjjQAQLEELDDETLDDDQQ'TEWQRYLR 

QSLEWAJCVMELLPTHAFSTr.PPVLQDNI>EVYLGI/QQFXVTSGS 

vjn«_u>iaxA]iwi^LJ<KIiHCSl4RDltSSLLQAVGRIiAEYPlGDVFAAR 

FNDALTWERLVKVTLYGSQIKLywiETAVPSVLKPDLIDVHAQ 

SIAALQAYSHWIAOYCSEVHRQNTQQFVTLISTTMDAITPLIST 

KVQDKLLLSACHLLVSLATTVRPVFLI5ZPAV0KVPNRITDASA 

I«RLVDKAQVLVCRAI>SNI IXLPWPNLPENEQQWPVRS IMHASLI 

SALSRDYRNLiCPSAVAPQRKMPLDDTKLIIHQTLSVI.EDrVENI 

SGESTKSRQICYQSLQESVQVSIALFPAFIHQSDVTDEMLSFFI* 

TLFRGIiRVQMGVPFTEQIIQTPUOMFTREQIAESILHEGSTGCR 

WEKFLKILQVWQBPGQVFKPFLPSIIALCMEQVYPIIAERPS 

PDVJCAELFELLFRTLHHNWRYFFKSTViASVQRGIAEEOMENEP 

Ut ^AlMQAFGQSFLQPDIHLFKQNLFYLBTLNTKQKLYHKKIFR 

TAMLFQPVNVLLQVLVHKSHDLLQEEIGIAIYNMASVDFDGFFA 

AFLPEFLTSCDGVDAWQKSVLGRNFKMDRVRRERGRAKRRABWA 

RKPGTCAARRGHIEA3GRGLCPPCSIAAAHEMPADLVL 


5557


1712 


491 


VlbGAGLRUKDMWIPWGbPKRLRLSALAGAGRFCILG'SEAATR" 

KULPARNHCGI^SDSSPQLWPEPDFRNPPRKASKASLOFKRYVTD 

RRLAETlAQIYLGKPSRPPHLLLECNPGPGILTQALbBAGAKVW 

AI.ESDKTFIPHLESLGKITLDGKLRVIHCDFFKLDPRSGGVIKPP 

AMSSRGLFKNLGIEAVPWTADIPLKWGMPPSRGEKRALWKrAY 

DLYSCTSIYK:FGRIEVNMPIGEKEFQKLMADPC!NPt)I.YHVLSVI 

WQLACEIKVLHMEPWSSFDlYTRKGPr^NPKRRErJbDQLQQKLY 

ij-tywA f KUWijp i^iCNIiTPMNYNIFFHLLKHCFGRRSATVIDHtiRS 

LTPLDARDILMQIQKQEDHECWNMHPQDFKTLFETIERSKDCAY 
KWLYDETIjEDR 


5558


1509 


96 


KAGCrHPQVPADI^APAEPt^PQKTCVCLIsQPQPGGORGPTTMr" K 

TGVFSMRLWTPVGVLTSLAYCLHQRRVALAELQEADGQCPVDRS 

LLKLKMVQWFRHGARSPLKPLPLEEQVEWNPQLi:,BVPPQTQFD 

YTVTlfl^GPKPYSPYDSQYHETTLKGGMPAGQLTKVGMQQMFA 

LGERI^RKNYVEDtPFLSPTPNPQEVFlRSTNIFRNLESTRCDIJ^ 

GLFQCQKEGPIIIHTDEAD3EVI,YPNYQSCWSliRQRTRGRRQTA 

SI^PGrSEDI,KKVKDRMGrr>SSDKVPFFir^WAAEQAHNLPS 

CPMLKRFARMIEQRAVDTSLYHiPKEDRESLQMAVGPFLHrLES 

NLBKAMDSATAPDKIRKIiYLYAAHDVTPIPIiLMTLGIFDHKWPP 

FAVDLTMELyQHLESKEWFVQLYYHGKEQVPRGCPDGflK::PIJ)MP 

liNAMSVYTLSPEKYHALCSQTQVMEVGNEE 


5559
5560


150


1983 


PLAATAHFAKMSRVAKYRROVQPlSunTnc'T T F'Tn'qpc'PMgpT tjtf — 

ELDWDPDGSVPVGLRQRNQTEKQSTGVYNRKAMLNFCEKETKK 

LMQREMSMDESKQVETKTDAKNGEERGRDASKKALGPRRDSDLG 

KEPKRG6LKKSFSRDRDEAGGKSGEKPKEEKXIRGIDKGRVRAA 

VDKECEAGKDGRGEERAVATKKEEEKKGSDRNTGLSRDKDKKREE 

MKEVAKKEDDEKVKGERRNTE>TRKEGEKMKRAGGNTDMKKEDEK 

VKRGTGNTDTKXDDEKVKKNEPLHEKEAKDDSKTKTPEKQTPSG 

PTKPSEGPAKVEEEAAPSIFDEPLERVKNNDPEMT3VNVNNSDC 

ITWElIiVRFTEAI^PtnrVVKIiFALANTRADDHVAPAIAIMLKAN 

KTITSLNUDSNHlTGKGlLAIFRALLQNimiTELRFHETORHICG 

GKTEMElAJaDKENTTIiLKLGYHFEIAGPRPrrVTWi^RWProKQ 

RQKRLQEQRQAQBAKGEKKDLLEVPKA6AVAR6SPKPSPQPSPK 

PSPKWSPKKGGAPAAPPPPPPPLAPPLIMENLKNSLSFATORKM 

GDKVLPAQEKNSRDQLIxAArRSSNLKQLKKVEVPKLLQ 


5561


9 

2175 


921 

1775


SSWEFSALfeVSMACbSPSQtiQKPQODGFLVIaEGPliSAEECVAM ~ 
aQRIGElVAEMDVPLHCRTEFSTQEEEQLRAQGSTDYFIiSSGDK 
tRFFFEKGVFDEKGNFLVpPEKSINKIGHAIiHAHDPVFKSITHS 
FKVQTIARSI/SLQMPWVQSMYIFKQPHFGGEVSPHQDASFLYT 
SPIX3RVLGVWI AVEDATLENGCLWFI PGSHTSG VSRRMVRAPVG 
3APGTSFLGSEPARDNSLFVPTPVQRGALVLIHGEWHKSKQNL 
3DRSRQAYTFHLMEASGTTWSPENWLQPTAELPPPQLYT f 
-YFIFQFFSSPypGLHPHQTPAPLPNPGLYPPPVSMSPGQPPPQ 



5562



UblAPTyFSAPGVKNFGNPS YPYAPGALPPPPPPHLYPNTOAPS " 
QVYGGVTYYMPACJQQVQPKPSPPRRTPQPVTIKPPPPBWSRds 



5563 



342 



1385 



5564 



914 



5565



993 



138- 



SSGKyDMAAAcjAAHLVRGL KAGVLSQADYLMLVQCETI^EDIJajf 

I/QSTDYGNFIiANEASPL-mviDDRLKErowVEFRHMRmUYE 

UVSFLDFITYSYMIDNVILLITGTLHQRSIAELVPKCHPLGSFE 

OMEAVNIAOTPAELyWAILVDTPLAAFFQDCISBQDLDEMNIEI 

IRNTLYKAYLESPYKFCTLLGGTTADAMCPILEPEADRRAPIIT 

INSPGTEUSKEDRAKLPPHCGRLYPEGLAOIARADDYEQVKNVA 

DYYPEYKI^FEGAGSNPGDKTLEDRFFEHEVKLNKtAPUJQFHF 

C3 VFYAFVKLKEQSCRNIVWIAECI AQRHRAKIDNYIPTF 

LQSTDYGNPLAMEASPLTVSVIDDRUCEKMWEFRHMRNHAYEP 
LASPLDFITYSYMIDNVILLITGTLHQRSIABLVPKajPLGSPE 
QMEAVNIAQTPAELYNAILVDTPIAAPFQDCISEQDU5BMNIEI 
IRNTLYKAYLESFyKFCTLLGGTTADAMCPlI,EFEADRRAFI IT 
INSFGTBr^KEDRAKLFPHCXSRLYPEGIAQLARADDYEQVKNVA 
DYYPEYKIJGFEGAGSNPGDKTI^DRPFEHEVKIjnaAFLMOFHF 
GVFYAFVKLKEQECRNIVMIAECIAQRHRAKIDNYIPIP 



SSesn 2043 



1232 



KVKRDKRAVwTARGIU^RC GDSMSGGWMAQV/GAWRTOAI^IAIiIJ. 
LLGUSLGLEAAASPLSTPTSAQAAGPSSGSCPPTKFQCRTSGLC 
VPLTWRCDRDLDCSDGSDEEBCRIBPCTQKaQCPPPPGLPCPCT 
GVSDCSGGTDKKI^RNCSRlACtAGBURCTbSDDCiPLTWRCDGH 
PDCPDSSDEI.GCGTNE1LPEGDATTMGPPVTLESVTSLRKATTM 
GPPVTLESVPSVGNATSSSAGDQSGSPTAyGVIAAAAVI^ASI.V 
TATLI,LtiSWLRAQERLRPI,GLL VAMKESLLLSEOKTST.P 

'kWNSPNPARAGSiaRPQRAP GiiVSAVAI4TAAVFFGCAFXAFGP "A 
IALYVPTIATEPIJiIIFLlAGAPFWI.VSI.t,lSSI,VWPMARVIlD 
NKDGPTQKYLLI FGAFVSVYIQEMPRPAYYKLLKKASEGLKS IN 
PGETAPSMRLLAYVSGLGFGIMSGVFSFVNTLSDSLGPGTVGIH 
GDSPQFFLYSAFMTLVI ILLHVFWGIVPFDGCEKKKWGILLI VL 

LTHLr.VSAQTFISSYYGINlJ^AFlII.VLMGTWAFI»AAGGSCRS 
LfOiCLLCQDKNFIiLYNQRSK 



shiqhhgrgaqapvkmvswmisraw i^vfgmlypayysykavkt 

K3mCEyVRWMMYWIVVALYTVIETVADQTVAWFPr.yYELKIAFV 
iwrMPYTKGASLlYRKFLHPLLSSKEREIDDYlVQAKBRGYET 
MVNFGRQGLNLAATAAVTAAVKSQGAITERraRSFSMHDliTTIQG 
DEPVGQRPYQPLPEAKKKSKPAPSESAGYGIPLKDGDEfCrDEEA 
EGPYSDNEMLTHKGPRRSQSMKSVKTTKGRKEVRYGSLKYKVKK 



5568



1731 



587 



i::Fl^CiVSPDLANKDGLTALHQC CJI3PFREMVQQLLEAGANINA 
CDSECWTPLHAAATCGHtJHLVEIJLIASGANIJ:AVNTDGNMPYDL 
CDDEQTIJ>CI^AMADRGITQI>SIEAARAVPEU?MLD0IRSRLQ 
AGADLHAPLDHGATLLHVAAAWGFSEAAAUCiEHRASLSAKDQD 
GWEPLHAAAYWGQVPLVELLVAHGADLNAKSLMDETPLDVCX3DE 
EVRAKLIiELKHKHDAI^LRAQSRQRSLLRRRTSSAGSRGKWRRV 
SLTQRTDLYRKQHAQEAIVWQQPPPTSPEPPEDNDDRQTtJAELR 
PPPPEEDNPEWRPHNGRVGGSPVRHLYSKRLDRSVSYQt^PLD 

STTPHTLVHDKAHHTLADLKRQRAAAKLQRPPPEGPESPETABP 
GLPGDTVTPQPDCGFRAGG DPPLLKLTAPAVFRPVRPPP^r^r.T.T^ 



5569



835



AEDRQPASRkLiAGTrAAMAASGPGCRSWCLCPKVPSATFFTALI, 
SliI,VSGPRI»FLLQQPIiAPSGLTl,KSEAI*RNWQVYRr.VTYIFVYE 
KPrSLLCGAtrrWRFAGJIPERTVGTVRHCPFTVIFAIFSAIIPL 
SPEAVSSLSKLGEVEDARGFTPVAFAMZ^VTTVRSRMRRALVFG 
MWPSVLVPWLLLGASWUPQTSFLSilVCGLSIGrAYGLTYCYS 
IDI^ERVALKLDQTFPFSLMRRISVFKYVSGSSAERRAAQSRKL 
NPVPGSYPTQSC34PHLSPSHPVSOTQHASGQKIASWPSCTPGHM 
PTLPPYQPASGLCYVQNHFGPNPTSSSVYPASAGTSLGIQPPTP 
VWSPGTVYSGAU3TPGAAGSKESSRVPMP 



QTPCPIAWERGSRsE D^ISyPGOKPPrCSS ^-^«MDVnP.Q<;T.ptJTr^- i ^ 








bKLIiLLLLLLPLRGQANTGCYGIPGMPGIaPGAPGKDGYDGLPGP 

KGEPGIPAIPGIRGPKGQKGEPGI.PGHPGKKGPMGPPGMPGV^ 

PMGIPGEPGEEGRYKQKFQSVPTVTRQTHQPPAPNSLIRFNAVL 

TNPQGDYITrSTGKFTCKVPGLYYFVYHASHTANLCVIXYRSGVK 

WTFCGHTSKTNQVNSGGVLLRLQVGE£VWLAVWD^YD^fVGIQG 
SDSVFSGFLLFPD 


5570 


264 


946


RDRRDRGGVATSTEBPARPRAPQSRGPGPVSQTGRGRERGGGDT" 

MSSPSPGKRRMDTDWKLlESKHEVTItiGGLNEPWKPyGPQGT 

PYEGGVWKVRVDLPDKYPFKSPSIGFMNKI FHPM IDEASGTVCL 

DVINQTMTALVDIiTMIFESFLPQLLAYPNPIDPJLWGDAAAMyLH 

RPEBYKQKIKBYIQKYATEEAI*KEQEEX5TCDSSSESSMSDFSED 

EAQDMEI* 


5571




946 


RORRDRGGVATSTEEPARPRAPQSRGPGPVSQTGRGRBRGGGDT ' 
MSSPS PGBCRRMDTDWKLI BSKHEVT1U3GLNB FWKFYGPOGT 
PYEGGVWKVRVDLPDKYPFKSPSIGFMNKIFHPNIOEASGTVCL 
PVIWQTVrrAIiYDliTNIFESFbPQLIiAYPNPIDPLNGrwvAMYIiH 

RPEBYKQKIKEYIQKYATEEALKEQEEGOWDSSSEBSMSDFSED 
EAQDMEL 


5572


2802 


2085 


RTDYRTGIPGRRFRVMAAGIX3DVKIiGTIiGSGSESSm)GGSESPG 
DAGAAAEGGGWAAAAIALLTGGGEMI.IiNVALVALVLU3AYRLWV 
RMGRRGi.GAGAGAGEESPATSr.PRMKKRDFSI.E<2IiRQYDGSRNP 
RILLAVNGKVFDVTKGSKFYGPAGPYGIFAGRDASRGIiATFCtiD 
KDALRDEYDDLSDLNAVQMESVREWEMQFKEKYDYVGRliKPGE 
EPSEYTDEEDTKDHNKQD 


5573


2562


219 


VPARTPNAEDQGPEARAATATPCQSGGRERAGEAAED6VIQWAAF 
SEMGVMPBIAQAVEEMDWLI^PTDtQTlESIPLILGGGDVIiMAAET 
GSGKTGAFSIPVIQIWETLKDQQEGKKGKXTIKTGASVLNKWQ 
MNPYDRGSAFAIQSDGLCCQSREVKEWHGCRATKGIiMKGKHYYE 
VSan)QGLCRVGWSTMQASLDLGT0KFGJPX3FGGTGKKSHWKQn) 
NYGEEFTMHDTIGCYLDIDKGHVKFSKNGKDLGIAFEIPPHMXN 
QALFPACVLKNAELKFNFGEEEFKFPPKDGFVALSKAPDGYIVK 
SQHSGNAQVTQTKFLPKAPKALIVEPSRELAEQTLNNIKOFKKY 
IDNPKLRELLI IGGVAARDQLSVLENGVBIWGTPGRUJDLVST 
GKLNLSQVRFLVLDEADGLLSQGYSDFINRMHKQIPQVTSDGKR 
LQVIVCSATLHSPDVKKLSEK^^ffiFPTWVDLKOEDSVPI>TVHHV 
WPVNPKTDRLWERliGKSHIRTDDVHAiCDimiPGANSPEMWSEA 
IKILKGEYAVRAIKEHKMDQAIIFCRTKIDCDNLEQYFIQQGGG 
PDKKGHQFSCVCLHGDRKPHERKQNLERFKKGDVRFLICTDVAA 
RGIDIHGVPYVIWVTLPDEKQNYVHRIGRVaRAERMGLAISLVA 
TEKEKVXflYHVCSSRGKGCYNTRLKEIXl^GCTlWYIIEMQLLSEIEE 
HI,NCTISQVEPDI KVPVDEFDGKVTYGQKRAAGGGS YKGHVDI L 
APTVQEIAALEKEAQTSFLHLGYLPNQLFRTF 


5574


1731 


952 


KEGLEVFKEQELQPSDKGAVPEDASTERSAMASLGLQIiVGYILG 
LLGLtiGTLVAKLLPS WKTS S YVGAS 1 VTAVGFS KGLWMBCATHS 
TGITQCDIYSTLhGhPADIQAAOmtiVTSSAlSSlACXXSVVGM 
RCTVFCQESRAKDRVAVAGGVFFILGGLLGFrpVAWNtHGILRD 
PYSPLVPDSMKFEIGEALYLGI ISSLPSLIAGI ILCFSCSCQRK 
RSNYYDAYQAOPLATRSS PRPGQPPKVKSEPNS YSLTGYV 


5575 


45G 


766 


LLWALPCPPPTAAA\nt4L3STGIiMEIJbEKMIiALTLAKAD^ 

LCSAWLLTASFSAQQHKGSIiQKDPLLSQACVGCLEAUliDyrJOAR 
SPDIGRNSPHYLMFP 


5576 


249 


2146 


RSWGAPWFi^RIiRRRHMPLRI^VtSCAFVLFLFttHSDVS^R"^ 

EEATEKPWUCSLVSRKDHVLDLMLEAMWNLRDSMPKLQIRAPEA 

QQTLFSINQSCLPGPyXPAELKPPWERPPQDPNAPGADGKAFQK 

SKWTPLETQEKEEGYKKHCFNAPASDRISLQRSLGPDTRPPECV 

DQKFRRCPPLATTSVXIVFHIJEAWSTLLRTVYSVLHrrPAILLK 

EI ILVPDASTEEHLKEKLEQYVKQLQWRWRQEERRGLITARL 

LGASVAQAEVLTFLDAHCECFHGWLEPLIARIAEDKTVWSPDI 

VTIDLNTFEFAKPVQRGRVHSRGNFDWSLTFGWBTLPPHEKQRR 

KDETYPZKSPTFAGGLFSlSKSYPEHIGTYDNQMEIWGGENVEIif 








" SFRVWQCGUQLEIIPCSVVGHVFRTKSPHTFPKGt^VIARNQVR ' 

LAEVWMDSYiCKIFyRRNi,QAAKMAQEKSKGDISERLQLRBQLHC 

HNPSWYLHNVyPEMFVPDtTPTFyGAlKNI/STNOCLDVGBNURG 

GKPLlMYSCHGLGGNQYFEYTTQRDLRHWIAKQbCUIVSKGALG 

LGSCHFTGKWSQVPKDEEWBIAQDQLXRWSGSGTCLTSQDKKPA 
MAPCNPSDPHQLWLFV 


5577 
5578 


3 


1275 


RNSDCSCGKiSVHCLPWVLFlLDLKVKSSMFCPLKLlLliFm^D" 

YSLSLNDUJVSPPELTVHVGDSALMGCVFQSTEDKCIPKIDWTL 

SPGEHAKDEYVLYYYSNLSVPIGRFQNRVHLMGDILCNOGSI,Lt 

QDVQEADC?GryiC£;lRl,KGESQVFKKAVVl.HVLPBBFKEI*IVHV 

GGLIQMGCVFQSTEVKHVTKVEWIFSGRRAKEEIVFRYYHKLRM 

SVEYSQSWGHFQNRVNLVGDIFRKDGSXMLQGVRESDGGNYTCS 

rHI.GNLVFKKtIVLHVSPEEPRTLVTPAALRPLVXX5GNQI,VIIV 

GIVCATII^LPVLILIVKKTCXTNKSSVNSTVLVKNTKKTNPEIK 

EKPCHFERCEGEKHIYSPIIVREVrEEEEPSEKSKATYMTMKPV 

WPSLRSDRNNSLEKKSGGGMPKTQQAF 




3 


783 


AVESMAS PvjAUi<Af FELPERNOGYREVEYWIXJRYQGAADSAPYD" 

WFGDPSSFRALLEPEIiRPEDRlI.Vt*GCGNSAI*SYBLFLGGFPNV 

TSVDYSSVWAAMQARYAHVPQLRWETMDVRKIiDFPSASFDWL 

EKGTLDALLAGERDPWTVSSEGVHTVDQVLSEVSRVIiVPGGRFI 

SMTSAAPHFRTRHYAQAYYGWSLRHATYGSGFHFHLYLMHKGGK 

LSVAQLJ^LGAQILSPPRPPTSPCFLQDSDHEDFLSAIOI. 


5579 


3 


1540


rnsgi^gasaxj\rhgggiaggvgv«x;gacasrcqgvmeglltr 

CRALPAIATCSRQLSGYVPCRFHHGAPRRGRRLWL^SRVFQPQNi;. 

nEDRVIiSLODKSDDI.TCKSQRI,MLQVGr.iyPASPGCYHLLPyTV 

RAMEKLWVIDQEMOAIGGQKVNMPSIiSPAELWQATTfRWDIiMGK 

ELLRLRDRHGKSYCX.GPTHEEAITAriIASQKfCLSYKQLPFLLYQ 

VTRKFRDEPRPRFGLLRGRfcrFYMKDMYTPDSSPEAAQQTYSLVC 

DAYCSLFNKLGriPFVICVQADVGTIGGTVSHEPQLPVDiGEDRLA 

ICPRCSFSANMETLDLSQMNCPACQGPLTKTKGIEVGHTPYIiGT '"^ 

KYSSIFNAQFTNVCGrPTlJ^K<3CYGI/3VTRriAAAIEVl.STBD 

CVRWPSLLAPYQmC^IPPICKGSK£QAASEI,IGQLYDHITEAVPQ 

LHGEVLLDDRTHLTIGNRLKDANKPGYPFVirAGKRAIiEDPAHF 

EVWCQMTGBVAFLTKDGVMDLLTPVQTV 


5580


1681 


450 


ADAGTRCIPGFWPSGAGYSAPAQRGRRSSGfeMfeAAAAPGLTAP 
WRLLQCCELEAGELGMAVPAAj^MGPSALGQSGPGSMAPWCSVSS 
GPSRYVI^MQELFRGHSKTREFLAHSAKVHSVAWSCDCRRIASG 
SFDKTASVFLt,EKDRLVKENNYRGHGDSVDQI,CWHPSNPDI,FVT 
ASGDKTIRIWDVRTTKCIAIWTKGENINICWSPDGQTIAVGNK 
I)DVVTFinAKTHRSKAEEQFKFEVNEISWNNDNNMFFl.TNGNGC 
INILSYPELKPVOSINAHPSNCICIKFDPMGKYFATGSADALVS 
LWDVDELVCVRCFSRLDWPA^TLSFSHDGKMLASASEDHFIDIA 
EVETGDKLWEVQCESPTFTVAWHPKRPLI4AFACDDKDGKYDSSR 
EAGTVKliFGIiPKDS 


5581 
5582 


54 


947 


GGGSGPRAPSATI4LDTGESVAAVASGEDKGIAASAAAAAVFACS 
CSPDPQSSTMNPVYSPVQPGAPYGNPKNMAYTGYPTAYPAAAPA 
YNPSLYPTNSPSYAPEFQFI/HSAYATliLMKQAWPONSSSCGTSG 
TFHLPVDTGTENRTYQASSAAFRYTAGTPYKVPPTQSNTAPPPY 
SPSPNPYQTAMYPIRSAYPCJQNLYAQGAYYl^JPVYAAQPHVIHH 
TTWQPHSIPSAIYPAPVAAPRTNGVAMGMVAGTTMAMSAGTLL 
TTPQKTArGAHPVSMPTYRAQGTPAYSYVPPHW 




5775 


2739 



IITNNNNVIIPbviAYHLSGSAQARGERSPAERIJ^ERQKRKADI ' 

EKGLQFIQSTLPLKQEEYEAFLLKLVQKLFAEGNDLFREKDYKQ 

ALVQYMEGZJJVA0YAASDQVALPRELLCKLHVNRAACyPTMGLY 

EKALEDSEKALGliDSESlRALFRKARAiNEI/SRHKEAYECSSRC 

SlALPHDESVTQLGQELAQKLGLRVRKAYKRPQELETPSLIiSNG 

rAAGVADOGTSNGIiGSIDDXETDCYVDPRGSPAIjLpSTPTMPLF 

PHVLDLLAPUaSSRTLPSTDSLDDFSDGDVFGPELDTLIiDSLSL 

/QGGJCSGSGVPSELPQLIPVFPOGTPLLPPWGGSlPVSSPIiPP 

iSPGLVMDPSKKIAASVLDALDPPGPTLDPtjDLliPYSErRLDAi 










Ubt(oi>iK<jriiliUKi'Utjll?riKlirrNSQDHRPPSGAQKPAPSPEPCMPN~ 
TALLIKNPLAATHEFKQACQLCYPKTGPRAGDrrYREGLEHKCK 
RDrLLGRLRSSEDQTWKRrRPRPTKTSPVGSYyLCKDMINKQpC 
KYGDMCTFAYHQEEIDVWTEERKGTLNRDLbFDPLGGVKRGSLT 
lAKLLKEHQGIFTFIiCEICFDSKPRIlSKGTKDSPSVCSNLAAK 
HSFYNNKCIiVHIVRSTSI^KYSKlROFQEHFOFDVCRHEVRYGCL 
REDSCMFAHSFIELKVWltLQQYSGMTHEDIVQESKKYWQQMEAH 
AGKASSSMGAPRTHGPSTFDLQMKFVCGQCWRNGCWEPDKDLK 
YCSAKARHCWTKERRVI.LVMSKAKRKWVSVRPLPSIRNFPQQyD 
I.CXHAQNGRKCQYVGWCSFAHSPEERDMWTPMKENKILDM<5QTy 
DMWLKKHNPGKPGEGTP r S SREGEKQIQMPTDYADl MMGYHCWL 
0GKMSNSKKQWQ0HIQSEKHKEKVFT3DSDASGWAPRFPHGEFR 
LCDRliQKGKACPDaDKCRa^HGQEEnNEWLDRRBVLKQKLAKAR 
. J^^l^'^PRl^OrJFGKYNFI.LQEDGDIaAGATPEAPAAAATATTCE 




5583 


3 


1265 


SSGCRQGRPGRSDRPRPPPRRHKMVKBTRYYDILGVKPSASPEE 
IKKAYRKIiAIiKYHPDKKPDEGEKFiaiSQAYEVLSDPKKRDVYD 
QGGEQAIKEGGSGS PSPSS PMDIFDMFFGGGGRMARERRGKNW 
HQLSVTLEDLYNG\n'KKIALQKNVICEKCEGVGGKKGSVEKCPIi 
CKGRGMHlJJIOQIGPGMVQQXQTVCIECKGQGERINPKDRCesC 
SGAKVI REKKI I EVHVEKGMKDGQKII,FHGEGDQBPELEPGDVI 
iVJUDQKDHSVTQRRGHDLIMKMKIQLSEAXiCGFKKTIKTIiDNRI 
LVITSKAGEVIKHGDIiRCVRDEGMPIYKAPLEKGILIIQFLVIF 
PEKI«^LSIiEKIiPQi:jEAI.LP?RQKVRITDDMDQVELKEFCPNEC^ 
WRQHREAYEEDEDGPQAGVQCQTA 




5584


3 


12^5 ■ • 


SSGCRQGRPGRSDRPRPPPRRHKMVKETRYYDILGVKPSASPEE 
IKKAYRKtALKYHPDKNPDEGEKFKLISQAYEVLSDPKKRrVYD 
QGGEQAIKEGGSGSPSFSSPMDIFDMPFGGGGRMARERRGKNW 
nQhSVTLEDLYmVTKKtAhQIQWICEKCEGVGGKKGSVEKCPh 
CKGRGMHIHIQQIGPGMVQQIQTVCIECKGQGERINPKDRCESC 
SGAJCVXREKKIIEVHVEKGMjCDGQKXI.FHGEGr>QEPELEPGDVI 
IVLDQKDHSVPQRRGHDIilMKMKIQLSEALCGFKKTXKTLDKRI 
LVITSKAGEVIKHGDLRCVRDEGMPXYKAPLEKGXLIXQFLVXP 
PEKHWLSLEKLPQLEAIiLPPRQKVRlTDDMDQVELKEFCPNEQH 


5585 


2619 


915


LPAGTPESSLHEAI,DQCMTAI*DLFI,TNQFSEALSYIJCPRtkESM 
YHSLTYATXLEMQAMMTFDPQDIXaUAGNMMKEAaMLCQRHRRKS 
SVTDSPSSLVKRPTLGQFTEEEIHAEVCYAKCLLQRAALTFXQD 
ENMVSFrKGGIKVRNSYQTYKELDSLVQSSQYCRGSWHPHFEGG 
VKLGVGAPNLTLSMLPTRIUJLLEFVGFSGNKDYGLLQLEBGAS 
GHSFRSVI/:rVMLLLCYHTFLTFVLGTGNVNXEEAEKt.ljKPYIiNR 
YPKGAIFLFlAGRXEVIKGirXDAAIRRPEECCEAQQHWKQFHHM 
CYWEIiMWCFTYKGQWKMSyPYADt.LSKENCWSKATYXYMKAAYL 
SMFGKEDHKPFGDDEVELFRAVPGLKLKIAGKSIiPTEKFAIRKS 
RRYFSSNPISLPVPALEMMYIWNGYAVIGKQPKLTDGILEIXTX 
AEEKLEiCGPENEYSVDDECLViCLLKGLCI*m.GRVQEAEENm 
ISANEKKIKYDHYLIPNALLEIiALLLMEODRNEEAIKLIiBSAKQ 
NYKNYSMESRTHFRIQAATLQAKSSLENSSRSMVSSVSL 




5586
5587 


2619 
1768 


915 
148 


IiPAGTPESSI.HJ£ALDQCMTALDLFLTNQFSEALSYI,KPRTKESM 
YHSLTYAriLEMQAMMTFDPQDlLtAGNMMKEAQMLCQRHRRKS 
SVTDSFSSLVWRPTLGQFTEEEXHAEVCYAKCLLQRAALTFLQD 
ENMVSFIKGGIKVRNSYQTYKELDSLVQSSQYCKGENHPHFEGG 
VKLGVGAFNLTLSMLPTRILRLLEFVGPSGMKDYGLLQLEEGAS 
GHSFRSVLCVMU.LCYHTFLTPVX/3TGimiIEEAEKLLKPyiJm 
YPKGAIFLFIAGRIEVIKGNIDAAIRRFEECCEAQQHWKQFHHM 
CYWELMWCFTYKGOWKMSyFYADLLSKENCWSKATYiyMKAAYI, 
SMPGKEDHKPFGDDEVELFRAVPGLKLKIAGKSLPTEKFAIRKS 
RR YFSSNP ISLPVPALEMMYI WNG YAVIGKQPKLTDGXLEI ITK 
AEEPttiEKGPENEYSVDDECLVKLLKGLCLKYLGRVQEAEENFRS 
ISANEKKIKYDHYLXPNAJXEIALLLMBODRWEEAIKLLESAKQ 
tJYKNYSMESRTHFRlQAATLQAKSSLENSSRSMVSSVSU Jf 
SSAVPI3GAVGkPVAVAVGGPPHSCRCRPCCIiMAAIGVHLGCTSA ' 








CVAVYKDGRAGWANDAGDRVTPAWAySENEElvGLjUVKQSHl""~ 
RNISNTVMKVKQXLGRSSSDPQAQKYIAESKCI^VIEKNGKLR^E 
IDTGEETKFVKPEDVARLI FSKMKETAHSVLGSDANDWITVPf 
DFGEKQKNALGEAARAAGFWVLRLIHEPSAALIAVGlGQDSpiG 
KSN I LVFKLGGTS LSLS VMEVNSGI YRVLSTNTDDHIOSAHFTE 
TIAQ YLASEFORS FKttDVRGNARAMMKIiTNSAEVAKHSliSTLGS 
ANCFLDSLYEGQDFDCNVS RARPELLCS PLFNKCI EAIRGI*IiDQ 
NGFTADDINKVVL,CGGSSR I PKLQQLI KDIiPPAVEUJaS IPPDE 
VIPIGAAI EAGlIi IGKEKLLVEDSLMI ECSARDILVKGVDESGA 
SRFTVLPPSGTPLPARRQHTLQAPGSISSVCLELYESDGKNSAK 
EETKFAQVVLQDLDKKENGLRDIIAVLTMKRDGSiaVTCTDQET 
GKCEAISIEIAS 


5588


3 


589 


TPPPPEQAMVAATVAAAWI,LLWAAACAQQEQDFYD?KAVNIRGK 
LVSLEKYRGS VSLWNVAS ECGFTDQHrRALQQI*QRDLGPHHFN 
VLAFPCNQFGQQEPDSNKEIESPARRTYSVSPPMFSKIAVTGTG 
AHPAFKYLAQTSGKEPTWNFWKyLVAPDGKWGAWDPTVSVEEV 
RPQITALVRKLIU,KREDL 


5589 


1884 


553 


LRQAWHEGGIGQTDKERGAAAIiPGEEGDPTRGRSLGRASWESGS 
PRRPRSPFSSFLPRPICLSLEARPCSIEDRRNWSlilGRPGAPAS 
GIjNRSSGLWLGPDRCRPRSRCSCRVMENPSPAAALGKftLCAX,I.L 
ATLGAAGQPLGGESICSARAPAKYSITFTGKWSQTAFPKQYPLF 
RPPAQWSSIiLGAAHSSDySMWRKNQYVSWGCJeDFAERGEAWALM 
KEI EAAGEALQS VHAVFSAPAVPSGTGQTSAELEVQRRHSliVS ? 
WRrVPS PDWFVGVDSLDLCDGDRWREQAALDLYPYDAGTDSGF 
TFSSPNFATlPQDTVTEITSSSPSHPANSFYyPRl.KAI,PPIARV 
TLLRLRQSPRAFIPPAPVLPSRDNEIVDSASVPETPLDCEVSLW 
SSWGLOGGHCGRLGTKSRTRYVRVQPAITNGSPCPELEEEABCVP 
DNCV 


5590 


72 


896 


LCSSGALRIjLPAMVAWRSAFLVCLAFSLATLVQRGSGDFDSl^ii;" 

EnAVKETSSVKQPWDHTTTTTTNRPGTTRAPAKPPGSGLJDLADA 

LDDQDDGRRK^IGGRERWmVTTTTKRPVTTRAPANTLGNDFD 

IxADALDDRNDREDGRRKPIAGGGGFSDKDLEDXVGGGEYKPDKG 

KGDGRYGSNDDPGSGMVAEPGTIAGVASALAMAIilGAVSSYISY 

QQKKPCFS IQQGLNADYVKGENLEAWCEEPQVKYSTIjHTQSAE 

PPPPPEPARI 


5591


68 


1494 


agssrraaasrllvsagcrsi*agrasg\rlllpxelit.pgeeeama 
lrvtrkskinaenkakinmagakrvptapaatskpglrprtalg 
dignkvseqlqakmpmkkeakpsatgkvidkklpkplekvpmlv 
pvpvsepvpepepepepepvkeeklspepilvdtaspspmbtsg 
capaesdla^afsdvxiiavkdvdaedgabpwlcsbyvjcdiyayii 
rqleeeqavrpkyllgrevtgwfrailidwlvqvqmkprllqet 

MYMTVSIIDRFMQNNCVPKKMLQLVGVTAMFIASKYEEMYPPEI 
GDFAFVTDNTYTKHQIRQMEMKILRAUfFGLGRPIiPLHFLRRAS 
KIGEVDVEQHTLAKYLMEL'mLDYDMVHFPPSQIAAGAFClALK 
IXJSNGEWTPrLQHYLSYTEESLLPVMQHLAKNAAMVNQGLTKHM 
TVKNKYATSKHAKXSTLPQLNSALVQDIAKAVAKV 


5592 


242 


924 


YGESFCDWNOKD^ifjSAT??/!^ — 

VGK5ALVVRFriTKRFIWSYDPTI,ESTYRHQATXDDEWSMEILD 
TAGQEDTIQREGHMRWGEGPVLVYDITDRGSFEEVliPLKNILDE 
I KKPKNVTL riiVGNKADLDHSROVSTEEGEKLATErACAFYECS 

ACTGEGNITEI PYELCREVRRRRMVQGKTRRRSSTTHVKQAIKK 
MLTKISS 


5593 


3 


1113 


HASGGRAANMAAERGAGQQQSQE^IMEVDRRVEs'EESGDEEGKKk■ ' 

SSGIVADLSEQSLKDGEERGEEDPEEEHELPVDMETINLDRDAE 

DVDLNHYRIGKlEGFEVLKICVKTLCLRQNLIKCIENIiEELQSLR 

BLDI.YDNQIKKIBNLEALTELEIX.DISFNLLRNrEGVDKLTRLK 

KLFLVWNKISKlENLSNLHQLQMLEIiGSNRIRAIENIDTL'mLK 

SJ:,FLGKNKITKI*QNl,DALTiVrLTVLSMQflWRI*TKrEGr.QNr,VNrLR 

EXiYLSHNGIEVIEGt^ENNNKLTMLDIASNRIKKIENISHLTELQ - 

EFWMNDNIiLESWSDLDEI.KGARSLETVYIiERNPIiQKDPQYRRKT 



5594



5595



5597






1476 



698 



219 



731 



326 






HLALPSVRQ IDATFVRF * 

HASGGtUU^MAASRGAGQQ QSQEMMEVDRRVESKKSGDEEGraCH " 

SSGIVADLSEQSLKDGEERGEEDPEEBHELPVDMETINLDKDAE 

DyDLNHyRIGKIEGFEVLKKVKTLCLHQNLIKCIEWLEELQSLR 

EliDLYDNQIKKrENLEALTELElLDISmiiLRNIEGVDKLTRLK 

KLFLVmKISKIENLSNmQIiQMLELGSNRIRAlENIDTLTNLE 

SLFIX5KNKITKLQNLDALTNLTVLSMQSNRLTKIEGLQNLVNLR 

ELYLSHNGIEVIEGLENNMKLTMLDIASNRIKKI2NISHLTEL0 

EFW^INDNIXESWSDI^ELKGARSLETVYI,ERNPLQKDPQyRRKV 

MLALPSVRQIDATFVRP 

ARWNGRWVQVPAWPGPGCGTN ASGERQRQLPRAWKPVGRTLGSE 
PIAnAWSPPLYLFPIPLPSWAVSQpfPTLGTMFADl^YDIEEDK 



* vrvi^wci vwv^KS iJOiKTRVEVAKMIQEVKGEVTIHY 

NKLQADPKQGMSU3IVLKKVKHRLVENMSSGTADALGLSRAILC 

NDGLViaiLEELERTAEI,YKGyrrEHTKNI*U?APyEl.SQTHRAFGD 

VFSVlGVREPQPAASEAFV.KFADAHRSrEKFGIRi.LKTIKPMLT 

DLNTYLNKAIPDTRLTIKKYLDVKFEYLSYCLKVKEMrJDEEYSC 

rALGEPLYRVSTGWYEYRLILRCRQEARARFSQWRKDVLEKMSL 

I^KHVQDIVFQLQRLVSTMSKYYimCYAVIiRDADVFPlEVDEA 

HTTLAYGLNQEEFTDGEEEEEEEDTAAGEPSRDTRGAAGPLDKG 
GSV3CDS 



2440 



GAVLAPSSI^PAAELAAQGESQSLBD LSNTSRPTSEVYKISFI^P 

ngdkyixsdctbJtssgiyerngigihttpngivytoswkddkmmg 

FORLEHFSGAVYEGOFKDNMFHGLGTyTPPNGAKrTGNFNBNRV 

kgegeythxqgtrmdwtfhftscsqt 

XSCKMAADGUSSX*PASWRS VTLTHVEYPAGULSGHLLAYLSl^P 
VWIVGF\rrLIIFKREI^TtSFI^IAI^GVNWLIKNVlQEPR 
PCGGPHTAVGTKyGMPSSHSQFMWFPSVYSFLPr.yI,R^5HQT^|NA 
RPlJDr,LWRHVLSLGLLAVAPI»VSySRVYLI>YHTWSOVLyGGIAG 
GLMAIAWPIFTQEVLTPLFPRIAAMPVSEFFtilRDTSLIPKVLW 
FEYTVTRAEAKNRQRKLGTKIiQ 

UJGFJAASFlycJCVA5LYI>'I.SPPPPSV SGVPYSPANSSWSCA]: ; 

VPLLGSGVPPHPPAPSPCCSGQTMLKMLSFKLI.I1IAVALGFFEG 

DAKFGBRWEGSGARRRRCLNGNPPKRLKRRDRRMMSQLELLSGG 

EMLCGGPypRI,SCCliRSDSPGLGRLEWKrFSVTNNTECGKLLEE 

IKCALCSPHSQSl,PHSPEREVLERDr.VLpr^KDyCKEFFyTCR 

GHIPGFLQTTADEPCPyyARKDGGIiCFPDPPRKQVRGPASNYLD 

QMBEYDKVBEXSRKHKttNCFCXQEWSGriRQPVGALHSGDGSQR 

IiPIIaEKEGyVKILTPEGEIFKEPyLDIHKLVQSGIKGGDERGLI, 

SLAFHPNYKKNGKI>YVSyTTNQERWAIGPHDHILRVVEYTVSRlC 

NPHQVDLRTARVFLEVAEUfRKHLGGQLLFGPDGFLYI ILGDGM 

ITLDDMEEMDGI^SDFTGSVLHLDVDTOMCNVPysIPRSNPHPWS 

TNQPPEVFAHGLHI>PGRCAVDRHPTDININLriI,CSDSNGKNRS 
SARILQIIKGKDYESEPc:T.T.PPTrD??c;MnDr.trr-/^»«r™«y 



* X ^'^^^x^xwtMri, 1 uyuijFVTKQWQEKPJCCWSTSGSCRGyPSG 

HlLGPGEDELGEVYlLSSSKSMTQIHNGKLYiaVDPKRPLMPEE 

CRATVQPAQTLTSECSRIX:RnGYCTPTG KCCCSPGWEGDFCRTC 

GXGPIAASFUrCKVASLYIFLSPPPPSV i^OVPYSPANgSWSCAi r 

VPLLGSGVPPHPPAPSPCCSGQTMIJCMI^FKLrj,IAVaLGFPEG 

DAKFGERNEGSGARRRRCLNGKPPKRt^KRRDRRMMSQLELLSGG 

EMLC::GGPYPRLSCCLRSDSPGI.GRl.ENKlFSVTNNTECGKraLEE 
I KCALCS PHS OS L FHS PETR RVT.RDor .vr.nT.T ^ 



*iw J- i-rtwnr I lAKKJJGGliCFPDFPRKQVRGPASNYLD 
QMEEYDKVEEISRKHKHNCFCIQEVVSGLRQPVGALHSGDGSQR 
LFILEKEGYVKILTPEGElFKEPyiiDIHKLVQSGlKGGDERGLL 
SLAFHPNyKKNGKLYVsyTTNQERWAIGPHDHILRWEYTVSRK 
NPHQVDLRTARVFLEVAELHRKHLGGQLLFGPDGFLYI ILGDGM 
I TLDDMEBMDGLSD c V r T .mrn'mMrTotrnvc? T M e 








&ARILQIIicuKUyESEPSLLEFKPFgNGPI.VGGFVYRGCQ3ERI, 

xv^o 1 V c ^L?««^w t ij-xlqqspvtkqwqbkplclgtsgscrgzfSg 

HILGFGEDELGEVYILSSSKSMTQTHNGKLYKIVDPKRPIMPEE 
CRATVQPAOTLTSECSRI,CRKQYCT?TGKCCCSPGWEGDPCRTR 


5600 


1977 


1244 


SI.RVLSGHLMQTRPLVQPDKPAtiPKr-lVTIiDGVPSPPGYMSDQE" 

ii-usn^t liGMKPVNQTAASNKGLRGLltHPQQLHLLSRQLEDPNGSF 

SNAEMSELSVAQKPEKLLERCKYWPACKNGDECAYHHPISPCKA 

FPNCKFAEKCLFVHPMCKYDAKCTKPDCPFTHVSRRIPVI.SPKP 

AVAPPAPPSSSQLCRYFPACKKMECPPyHPKHCRPNTQCTRPDC 

TFYHPTINVPPRHALKWIRPQTSE 


5601 
5602 


1977 


1244 


SLRVLSGHIiMQTRDIiVQ PDKPAS PKFI VTLDGVPS PPGYMSDQE 
EDMCFEGMKPVNQTAASNKGLRGLUHPQQLHLLSRQLEDPNGSF 
SNAEMSELSVAQKPEKLLERCKYWPACKNGDECAYHHPISPCKA 
FPNCKFAEKCLPVHPNCKYDAKCTKPDCPFTHVSRRIPVLSPKP 
AVAPPAPPSSSQLCRYFPACKKMECPPYHPKHCRFNTQCTRPDC 
TFYHPTINVPPRHALKWIRPQTSE 




246 


766 


YHTSCTVWRTAKETU^NTEVPVGCLMVYNNEWGKGRNEVNQTK 
NATRHAEMVAIDQVLDWCRQSGKSPSEVFEHTVLYVTVEPCIMC 
AAALRLMKIPLWYGCQNERFGGCGSVLNIASADLPNTGRPPOC 
IPGYRAEEAVEWLKTPYKQENPNAPKSKVRKKECQQILNMF 


5603 
5604 


1 


565 


FRGRTPISGUh:i?GCAQYPIPATPARSGENRTMPGAGDGGKAPAR 
WUSTGLLGLFLLPVTLSLEVSVGKATDIYAVNGTEILLPCTFSS 
CPGFBDLHFRWTYNSSnAFKIIilEGTVKNEKSDPKVTLKDDDRI 
TriVGSTKEKRtQKlS IVLRDLEFSDTGKYTdJVKNPKENNLQHHA 
TIFLQWDRRMQ 




1 


1506 


EDIFPAQJuLtuXJRHERVWQOEPPVRDHRSWGG^GAGGVAGREWT 

DQGQVALGGHYMAEGEGYFAMSEDEIACSPYIPLGGDFGGGDFG 

GGDFGGGDFGGGDFGGGGSPGGHCX.DYCESPTAHCNVLNWEQVQ 

RLDGILSETIPIHGRGNFPTLELQPSLIVKVA7RRRLAEKR1GVR 

DVRLNGSAASHVLHQDSGLGYKDLDLIPCADLRGEGEFQTVKDV 

VLDCLtiDFLPEGVNKEKITPLTLKEAYVQKMVKVCNOSDRWSIiI 

SLSNNSGKNVELKFVDSLRRQFEFSVDSFQIKLDSLLLFYECSE 

WPMTETFHPTIIGESVYGDFQEAFDHLCWKIIATRNPEEIRGGG 

IiLKYCNIiIiVRGFRPASDEIKTJUQRYMCSRFFIDFSDIGEQQRKL 

ESYLQNHFVGI:,EDRKYEYLMTmG^AWESTVCLMGHERRQTLNI» 

ITMUVIRVlADQWrVIPKVANVTCYYQPAPYVADANFSNYYIAQV 
OPVFTCOOOTYSTMT.Dntf 


5605 
5606 


35 


1821 


SQRSCPRSPSSPAPPWARCyNPUSRTGGVPVPRAWSAGGPALGL 

MAAPVRLGRKRPLPACPNPLFVRWLTEWRDEATRSRHRTRFVFQ 

KALRSLRRYPLPLRSGKEAKILQHFGrJGLCRMLDERiQRHRTSG 

GDHAPDSPSGENSPAPQGRLAEVQDSSMPVPAQPKAGGSGSYWP ' 

ARHSGARVILLVLYREHLNPNGHHFLTKEELLQRCAQKSPRVAP 

GSARPWPALRSLLHRNLVLRTHQPARYSLTPEGrjEIAQKIAESE 

GLSLLNVGIGPKEPPGEETAVPGAASAELASEAGVQQQPItELRP 

GEYRVlXCVDIGETRGGGHRPELLREIiQRLHVTHTVRKLHVGDF 

Wn^AOETWPRDPANPGELVItDHIVERKRUODLCSSIIDGRFREQ 

KPRLXRCGIiERRVyLVEEHGSVHNLSLPESTLLQAVTNTQVIDG 

FFVKRTADIKESAAYLAIiLTRGLQRLYQGHTLRSRPWGTPGNPE 

SGAT^SPNPLCSLLTFSDFNAGAIKNKAQSVREVFARQLMQVRG 

VSGEKAAALVDRYSTPASLLAAYDACATPKBQETLLSTIKCGRI. 

QRNLGPALSRTLSQLYCSYGPLT 




3 


1099 



GRSRC?GPQARGGTMSPRSCLRSLRLLVFAVFSAAASNWI.YIiAK 
LSSVGS ISEEETCEKTiKGLIQRQVQMCKRNLEVMDS VRRGAQIA 
lEECQYQFRNRRWNCSTLDSIiPVFGKWTQGTREAAFVYAISSA 
C5VAPAVTRACSSGELEKCGCDRTVHGVSPQGPQWSGCSDNIAYG 
VAFSQSFVDVRERSKGASSSRALMWLHNNEAGRKAXLTHMRVEC 
KCHGVSGSCEVKTCWRAVPPFRQVGHAIiKEKFDGATEVEPRRVG 
5SRALVPRNAQFKPHTDEDLVYLEPSPDFCEQDMRSGVLGTRGR . 
rCNKTSKAIDGCBLDCCX3RGPHTAQVELAERCSCKFm*CCFVl4: 
^QCQRLVELHTCR 




5607 


521 


141 


PPVCNPAl5AMPSPGTVCSLLLLC4Ivir,WLDLAMA(^iiSFI«SPEHQRV " 

QQRKESKKPPAKLQPRALAGWLRPEDGGQAEGAEDELEVjmOvP 

FDVGIKLSGVQYQQHSQALGKFLODILWEEAKETIPADK 


5608 


2 


983 


WFQSPLRQADPGPPRHTLFMDFVAGAIGGVCGDAVGYPIiDTVlCV 
RIQTEPKYTGXWHCVRDTYHRBRVWGFYRGIiLLPVCTVSIjVSSE 
VFGTYRHCLAHlCRLRFGNPDAKPTKADITLSGCASGLVRVFIiT 
SPTEVAKVRLQTQTQAQKQQRRLSASGPLAVPPMCPVPPACPEP 
KyRGPriHCLATVAREEGLCGLYKGSSALVt.RDGHSFATyPLSYA 
VLCEMLSPAGHSRPDVPGVLVAGGCAGVIAWAVATPMDVIKSRL 
QADGQGQRRYRGLLHCMVTIVREEGPRVLFKGLVLNCCRAPPVN 
MWFVAYEAVLRLARGLLT 




5609 


1628 


304 


AKGVWVLPSPPPRPGRGALVSGSGliRRGRSGTSWRPRRMNHKSK 

KRIREAKRSARPELKDSLDWTRHNYYESFSIiSPAAVADNVBRAD 

ALQLSVEEFVERYERPYKPVVLLNAQEGWSAQEKWTLERLKRKy 

RNQKFKCGEDWDGYSVKMKMKYYIEYMESTRDDSPLYIFDSSYG 

EHPKRRKLLEDYKVPKPFTDDIiFQYAGEKRRPPYRWPVMGPPRS 

GTGIHIDPLGTSAWNALVQGHKRWCLPPTSTPRELIKVTRDEGG 

NOQDEAITWFNVIYPRTQLPTWPPBFXPLEILQKPGETVPVPGG 

Wmi\nnUtnUDTTIAlTQNPASSTNFPWWHKTVRGRPKLSRKWYR 

ILKQEHPELAVIiADSVDLQESTGIASDSSSDSSSSSSSSSSDSD 

SECESGSEGDGTVHRRKKRRTCSMVGNGDTTSQDDCVSKERSSS 
R 




5610 




1196 


LERTPA3ADMAWTKYQLFLAGLMLVTGSINTI»SAKWADNFMAEG 
CGGSKEHSFQHPFLQAVGMFLGEFSCLAAFYI,I,RCRAAGQSDSS 
VDPQQPFNPLLFLPPALCDMrGTSLMYVALNMTSASSFQMLRGA 
VIIFTGIJPSVAFLGRRLVLSQWLGIX4ATIAGI,VWGtADI,I,SKH 
DSQHiaiSEVITGDIiLIIMAQIIVAIQim.EEKFVYKHNVHPliRA 
VGTEGLr'GFVILSI.riLVPMYYIPAGSFSGNPRGTliEDALDAPCQ 
VGQQPLIAVALIiGNISSIAFFWFAGISVTKELSATTRMVLDSLR 
TWIWALSLALGWEJVFHALQrLGFLILLIGTALYNGIilRPliLGR 
LSRGRPLAEESEQEEtiLGGTRTPINDAS 


5611 


2 


577 


FVLPNRI,GIPG£iT^-RGPGACASSSSLAASAKPGAGGSPAIiAMSG 
ELSNRFQGGKAFGIiLKARQERRLAEINREFIiCDQKYSDEEINltPE 
KLTAFK^KYMEFDbNNEGEIDLMSLKRI^MEKl^VPKTHLEMKKM 
ISEVTGGVSDTISYRDFVWMMLGKRSAVIiKLVMMFEGKANESSP 
KPVGPPPERDIJ\SLP 


5612 


1 


721 


ASKDGYMDATlAPHRIPPEMPQYGEENHIFEUflC>AMWI«Ci!I£IJiS 
SLLTItEKfLI LNEPS YTATEARRLYLQRKTVPSALLVQLIQERLA 
EEDCI KQGWILDGI PETREQALRIQTLG ITPRHVIVLSAPDTVL 
IBRNLGKRIDPQTGEIYHTTFDWPPESEIQNRLMVPEDISELET 
AQKLLEYHRNIVRVIPSYPKIl^KVISADQPCVDVFYQALTyVQS 
KHRTNAPFTPRVLr,I/3PVGS 


5613 


115 


1279 


RGVDPALRRAEKMLPLSIKDDEYKPPKFNLFGKISGWFRSILSD 
KTSRKLPFFLCLNLSPAFVBLLYGIWSNCLGLISDSFHMFPDST 
AirAGIiAASVISKWRDfJDAFSYGYVRAEVLAGFVNGLPLIPTAF 
FIFSEGVERAIJVPPDVHHERLLLVSlLGFVm^lGIFVTKHGGH 
GHSHGSGHGHSHSLFNGALDQAHGHVDHCHSHEVKHGAAHSHDH 
AHGHGHFHSHDGPSLKETTGPSRQli^VFLmiADTIiGSIGVX 
ASAIKMQNFGLMIADPICSILIAILIWSVIPLLRESVGILMQR 
TPPLLBWSLPQCYQRVQQLQGVYStiQEQHFWTLCSDVYVGTLKL 
IVAPDADARWILSQTHNIFTQAGVRQLYVQIDFAAM 


5614


3 


1268 



LLSRNEHACPU2AGLGLTQRKPKAIRGREGRATNQGQGBiX3NER 
APWGARQRLGVMAELQQLQEFEIPTGRBALRGNHSALUiVADyC 
EDNYVQATDKRKALEETMAFTTQAIASVAYQVGNIAGHTLRMLD 
CK?GAAIJiQVEARVSTLGQtm?MHMEKWUUlEIGTIiATVQRI.PPG 
QKVIAPENLPPriTPYCRRPLNFGCLDDIGHGIKDIiSTQLSRTGT 
LSRKSIKAPATPASATLGRPPRIPEPVHIiPWPDGRLSAASSAS 
SLASAGSAEGVGGAPTPKGQAAPPAPPLPSSLDPPPPPT^VEVF 
2RPPTLEELSPPPPDEEI,PLPLDLPPPPPLDGDErx;i»PPPPPGE 
3PDEPSWVPAS YLEKWTLYPYTSQKDKELS FSEGTVI CVTRR? 
SDGWCEGVSSEGTGFPPGNYVEPSC " ~" 


5615 


9 


1558 


ALGRRRPGUFHHMEAAATPAAAGAARRKELDMDVMRPLINBQWF • 

DGTSDEEHEQELLPVQKHyQLDDQEGlSFVQTIWHLLKG>IIGTG 

LLGLPIJ\IKNAGIVLGPISLVFIGIISVHCMHIt,VRCSHFLCLR 

FKKSTLGYS DTVS FAME VS PWSCLQKQAAWGRSWDFFIjVITQL 

GFCSVYIVFLAENVKQVHEGFLESKVPISMSTN3SNPCERRSVD 

LRIYMLCFLPFlIIiLVFIRELKKLFVLSFLANVSMAVSLVliyQ 

YWRNKPDPHNLPIVAGmCKyPLFFGTAVFAFEGIGWLPLENQ 

MKESKRFPQALNIGMG I VTTLYVTLATIiGYMCFHDE HCGSITLN 

I»PQDVWr.YQSVKII>YSFGIFVTYSIQPYVPAEIIIPGITSKFHT 

KMKQICBFGIRSFLVSITCAGAILIPRLDIVISFVGAVSSSTIA 

LIL?PLVEXLTFSKEHYNIWMVLKNISlAFTGWGPI.IiGTYITV 

EEIIYPTPKWAGTPQSPFLNLNSTCLTSGLK 


5616 


1 


719 


DDFVRCXSPUbAAMGASARLLRAVIMGAPGSGKGTVSSRITTHFE 
LKHLSSGDljIiRDNMLRGTEIGVrAKAFIDQGKLIPDDVMTRLAL 
HELKNI^TQYSWLIjDGFPRTLPQAEALDRAYQIDTVINLNVPFEV 
IKQRLTARWIHPASGRVYNIEFWPPKTVGIDDLTGEPLIQREDD 
KPETVIKRI^KAYEDQTKPVLEYYQKKGVLETPSGTETNKIWPyV 

yaflqtkvpqrsqkasvtp 


5617 


176 


765 


PWRGRGSRPRGAGAMAEEQVNHSAGIAPDCEASATAETtvssVG"' 

tceaagkspep:<dydstcvfcriagrqdpgtellhcenedi,icf 

KDIKPAATHHYIiVVPKKHIGNCRTLRKDQVELVENMVTVGKTIL 

ERMNFTDFTimy^GFHMPPPCSISHLHLHVLAPVDQLGFLSKLV 
YRVNSYWFITADHIilEKLRT 


5618 


3 


1692 


YIJTYitJt,KSEMiaSCKEDLWEKLQYLWKSTtMLPEDLLRVPDES 

LPLNSGGDSLKSlRLLSEIBKLVGTSVPGLIiEIILSSSILEIYN 

HILQTWPBEDVTFRKSCATKRKLSNINQEEASGTSLHQKAIMT 

FTCHNEINAFWLSRGSQILSLNSTRFLTKLGHC5SACPSDSVS 

QTNIQNLKGLNSPVIUIGKSKDPSCVAKVSEEGKPAiaTQKMELH ^ 

VRWRSDTGKCVDASPLWIPTFDKSSTTVYIGSHSHRMKAVDFY 

SGKVKWSQILGDRIESSACVSKCGNFIWGCYNGLVYVLKSNSG 

EKYWMFTTEDAVKSSATMDPTTGLIYIGSHDQHAYALDIYRKKC 

VWKSKCGGTVFSSPCLNLIPHHLYPATLGGLLLAVNPATGNVXM 

KHSCGKPLFSSPQCCSOYICIGCVDGNIJ^CPTHFGEQVWQFSTS 

GPlFSSPCTSPSEQKIFFGSHDCFlYCCNMKGHliQMKFETTSRV 

yATPPAFHmrNGSNEMl*IjyVASTrX3KVWILESQSGQLQSVYELP 

GEVFSS P WLESMLI IGCRDNYVYCLDLLGGNQK 


5619 


2160 


1477 


DSPVLPTSGNVISTAQPAQPWSAVEAAliRSLGSPPGAiiRGCPCP 

AQSI»HSHQLAAWDPLKPSLRSYPPHr,LQHPQLRSLTASSGHL6R 

RSCPQPRPLEELLRAGSSTRPQPLTSSCCGMSCMYSFLGHCSVr. 

LWGTKGRGSGSPSSPGCCLHPPAQHSQDLPLVHVDVGWQPPLGP 
TVGLRPGIjlliGEr?OPr:aT.pafTinonrv^r»'DT tin»mmr»T%-*- - 
*. *'^•"i^■c■\JrJJ^JOli*\,v^K^AlJi«l^JlJt'y^Jy^^i;'JLijt'Ai^^ED^^GVPS 

ECSPPATP 


5620 


530 


1B2 


PLPPPTiJ^FI,TRSEYDRGVNTFSPEGRLFQVEYAIEAlkU3ST * 

AIGIQTSEGVCIAVEKRITSPLMEPSSIEKIVEIDAHIGCAMSG 

LIADAKTLIDKARVETQNHWFTYNETMTVESVTQAVSWLALQFG 

EEDADPGAMSRPFGVAIaLFGGVDEKGPQLPHMDPSGTFVQCDAR 

AIGSASEGAQSSLQEVYHKSMTLKEAIKSSLIILKQVMEEKLNA 

TNIELATVQPGQNFHMFTKEELEEVlkDI 


5621 


3 


619 


VVEFVByTAIT?ANVKNESLSSVQQtX3IKMiVRYGKFLSIJC.KDGA 
ENDLTWVLKHCERFLKQQQTSIKSSLLCLQGNYAGHDWFVSSLP 
MIMLGDKEKTFQFLHQFSRX/LTSAFLWLPRLHISSYLPNDTVES 
GIHPVYFCSTHYIEMLLKAELPLVFSAFHMSGFAPSQICLQWIT 
QCFWWYLDWXEICHYrATCVFLGPDYQVYICIAVFKHLQQDILQ 

HTQTQDLQVFLKEEALHGFRVSDYPEYMEIIiEQNYRTVLLRDMR 
NIRLQST 


5622


1122 


456 


AASTKDAVSRKRSHSASEKSGTGTSlSKRLNmPQIRNPMKAMY 
PGTFYFQFKNIiWEANDRNETWLCFTVEGIKRRSVVSWlcrGVFRN 
QVDSETHCHAERCPI.S W FCDDI l>SPNTKYQVTWyTS WSPCPDCA 
GEVAEFLARHSNVNXiTI FTARLY YFOYPCYQEGLRSI^SQEGVAV 










EIMDYEDFKYCWENFVYNDNEPFKPWKGLKTNFRULKRRLRESL 
Q 




5623 


3 


954


FLPFFIRAPKISRNGQWLFTFXTPFPFANKALPGWEGiVPACFW 
RKKrLTPSTGTMELLQVTXLFLLPSICSSNSTGVLEAANNSLW 
TTTKPSITTPNTESLQKNVVTPTTGTTPKGTITMBr*LKMSLMST 
ATFLTSKDBGLKATTTDVRKNDS 1 1 SNVTVTS VTLPNAVSTU3S 
SKPKTETQSSI KTTEI PGSVLQPDASPSKTGTLTSIPVTIPENT 
SQSQVIGTEGGKNASTSATSRSySSIILPWIAI.IVITLSVFVL 
VGLYRMCWKADPGTPENGNDQPQSDKESVKLLTVKTISHESGEH 
SAQGKTKN 




5624


159


898 


PGVAAAAGALPQYHGPAPALVSCRRELSLSAGSLQLERKRRDFT 
SSGSRKLYPDTHALVCriLEDNQFATQOAEIIVSALVKILEANMD 
IVYKDMVTKMCK?EITFCX3VMSQIANVKKD«IILEXSEFSAI/RAE 
KEKIKLEIiHQr*KQQVMDEVIKVRTDTKI.DFm.EKSRVKELYSUT 
EKKLLELRTEIVALHAQQDRAIiTQTDRKIETEVAGLKTMLESHK 
rJDNIKYLAGS I FTCLTVALGFYRLW 1 




5625


1 


i:.80 


TIPSSAAAQRAGPPAGALBALSPGGARAHAERRGEMRATPffiA 

AGSLSRXKRLELDDNLDTERPVQKRARSGPQPRLPPCIiI,PLSPP 

TAPDRATAVATASRLGPYVLLEPEEQGRAYQALHCPTGTBYTCR 

WPVQEAlAVIjEPYARtjPPHKHVARPTEVLAGTQIiljYTVFFTRTH 

GDMHSLVRSRHRIPBPEAAVbFRQMATALAHCHQHGLVIJiDLKL 

CRFVPADRERKKLVLENLEDSCVIiTGPDDSLWDKHACPAYVGPE 

ILSSRASYSGKAADVWSIiGVAIirrMIiAGHYPFQDSEPVUUFGKI 

RRGAYALPAGLSAPARCLVRCLLRREPAERLTATGILLHPWLRQ 

DPMPLAPTRSHLWEAAQWPDGLGLDEAREEEGDREWIjyG 


5 


5626 


3123 


2011


i'fKAI^SVAMENQVLTPHVYWAQRHRELYLRVELSDVQNPAI S 1 
TBNVLHFKAQGHGAKGDNVYEFHLEPLDLVKPEPVYKXiTQRQVN 
ITVQKKVSQWWBRLTKQEKRPLFXiAPDFDRWliDESDAEMELRAK 
EEBRr»NKI>RIiESEGS PBTLTNXjRKGYLFMYHLVQFLGFSWI FVN 
LWRFCI LGKESFYDTFHTVADMMYFCQMIAWETINAAIGVTT 
SPVLPSLIQLLGRNFIL»PlIFGTMEEMQNKAVVPFVFYriWSAIE 
IFRYSFYMLTCIDrroWKVLTVJLRYTLWlPLYPXiGCLAEAVSVIQ 
S IPIFNETGRFS PTLPypVKIKVRFSPFI^ I YL IMI FI^LYINF 
RHLYKQRRRRYGQiOCKKTH 




5627 


3123


2011


PPRALGSVAMEN-QVLTPHVYWAQRHRELYIiRVELSDVQNPAIS I" 
TENVLHFKAQGHGAKGDNVYEFHLEFIiDLVKPSPVYIObTQRQVN 
ITVQKKVSQWWERI.TKQEKRPLFLAPDFDRWLDESDAEMELRAK 
EEBRIiNKLRLESEGSPETLTNLRKGYLFMYWLVQPIiGFSWIFVN 
LWRFCI I^KESFYDTFHTVADMMYFCQMLAVVETINTUViaVTT 
SPVLPSLXQLLGRWFILFIIFGTMEEMQNKAWFFVFYLWSAIE 
IFRYSFYMLTCIDMDWKVLTWLRYTLMI PLYPLGCLAEAVSVIQ 

SI P IFNETGRPSPTItP YP VKI K VRFSFPLQI YLIMIPIiGLYIKF 
RHIiYKQRRRRYGOKKKKIH 




5628 


75 


1455 


VAGAMASKCuKAGFSSGSLKS PGGASGGSTRVSAM YSSS PCKLP"" 
SLSPVARSFSACSVGIiGRSSYRATSCLPALCLPAGGPATSYSGG 
GGWFGEGILTGNEKETMQSLNDRLAGYLEKVRQLEQENASLESR 
J. Kr. w t-iiyy V f X WL.i'jjy y SZFRT I EEttQKKTLCS KAENARXjWE I 

DNAKLAADDFRTKYETEVSLRQLVESDINGLRRIUDDltTLCKSD 

LEAQVESLKEELLCLKKNHEEEVNSLRCQI.GDRLNVEVDAAPPV 

DliNRVItEEMRCQYETLVENNRRDAEDWJUDTQSEELNQQWSSSE 

QliQSCQAEI lELRRTVNALEIELQAQHSMRDALESTLAETEARY 

SSQIiAQKQCMITWVEAQIiAEIRADLERQNQEYQVIJL,DVRARLEC i 

EINTYRGLLESEDSKLPCMPCAPDYSPSKSCLPCLPAASCGPSA 

ARTMCSARPICVPCPGGRF 




5629 


^287 


938 



GRPRSSSDNRNFLRERAGIiSSAAVQTRIGNSAASRRSPAARPPV 
PAPPALPRGRPGTEGSTSIiSAPAVIiWAVAVWWVSAVAWAMA 
NYIHVPPGSPEVPKLNVTVQDQEEHRCREGALSLLQHLRPHWDP 
QEVTIX?LFTDGXTNKLIGCYVGim3EDVVI.VRIYGNKTEr.LVDR 
DEEVKSFRVLQAHGCAPQLYCTFNNGLCYEFIQGEALDPKHVCN 
PAIPRLIARQIAXIHAIKAHWGWIPKSWI.WLKMGKYFSrilPTGF'^ 








ADEDIiNKkFI^DIPSSQiUjE£MTMMkljlIiSMXXSSPV\a^CHNDL~ 

LCKWIIYNEKQGDVQFIDyEYSGYNYIAYDlGNHFNEFAGV^V 

DYSLYPDRELQSQWLRAYLEAYKEFKGFGTEVTEKEVEII.PIOV 

NQFALASHPFWGLWALIQAKYSTIEFDFLGYAIVRFNQYFKMKP 
EVTALKVPB 


5630


1194 


278 


GFWAIAQTCAHHIiPPGSFWLVPASPWRLPEMSSFGYRTLTVALF 
TLICCPGSDEKVFEVHVRPKKLAVEPKGSLEVNCSTTCNQPEVG 
GLETSLDKILLDEQAQWKHYLVSNISHDTVLQCHPTCSGKQESH 
KSNVSVYQPPROVII.TLQPTLVAVGKSPTIECRVPTVEPLDSLT 
IiFLFRGNETLHYErPGKAAPAPQKATATFNSTADRBDGHRNFSC 
LAVIJ)Lf^RGGNIFHKHSAPFCMLEIYEPVSDSQMVIIVTVVSVI, 
LSLFVTSVLLCFIFGQHUIQQRMGTYGVRAAWRRLPQAFRP 


5631


1053 


290 


SRVDDF VRPE PSRAEP5 RSGRRRPAkKAATM^ VFGKt;^GiAGGGJr~ 

AGKGGPTPQEAIQRLRDTEEMI^KKQEFLEKKIEQELTAAKKttG 

TKNKRAAIKJALKRKKRYEKQLAQIDGTliSTlKFQREALENANTN 

TE\a,KNMGYAAKAMKAAHDNMDIDKVDBLMQI>IADQQELAEEIS 

TAISKPVGFGEEFDEDELMAEI^EELEQEEIiDKNLI^ISGPETVP 

LPNVPS lALPSKPAKKKEEEDDDMKEIiENWAGSM 


5632


3 


952 


WJX?WSPPRRLWWGSi:y3AAQRPAVPVSGIARSLHVETRRPHRRA ' 
SVRVARGRLGVWAQPQPLLPRPVGSRREMQPPGPPPAYAPTNGD 
FTFVSSADAEDLSGS I ASPDVKLNLGGDFl KESTATTFLRQRGY 

GWLLEVEDDDPED^^CPLLEELDIDLKDrYYKlRCVLMPMPSLGF 

NRQWRDNPDFWGPLAWLPFSMISIiYGOFRWSWIITIWIFGS 

r*TIFLIARVLGGEVAYGQVLGVIGYSI.LPI,IVlAPVLLWGSPE 

WSTLIKLFGVFWAAYSAASLLVGEEFKTKKPLLIYPIFLLYIY 
FLSLYTGV 


5633 


771 


460 


QGCSKTMSVGRPFYRSSEFMEQIiLSSHIjHQVPFFCCPrWCLCN 

CLFBNSVSKLYMLCFNFFMSIFFYSLSITKt,NLIYLWGLSYQSL 
Lt*LLLSGHRPWGSSMV 


5634 


1446 


855 


PRATGRXRSRAiyVSRPRAGAGASGAEPRSGRERSRLSGRRAPAM 
ARNTLSSRFRRVDIDEFDENKPVDEQEEAAAAAAEPGPDPSEVD 
GliRQGDMLRAFHAALRNSPVNTKWQAVKERAQGVVIiKVliTNFK 
SSEIEQAVQSliDRNGVBX^LMKYIYKGFEKPTENSSAVLIiQWHEK 
ALAVGGLGSlIRVIiTARKTV 


5635


3 


943


DRGPRSTATOTGRARVSFWRPPLDPGVKNSifVQISGEiatRFkTX*" 

RSLFHPFPVTRSGAPRAVLVGSSWPAKMVAPAVKVARGWSGIAI, 

GVRRAVLQLPGLTQVRWSRYSPEFKDPLIDKEYYRKPVEELTEE 

EKYVRELKKTQLIKAAPAGKTSSVFEDPVISK^FTNMMMIGGNKV 

lARSLMIQTr^EAVKRKQFEKYHAASAEEQATIERNPYTIFHQAl, 

KNCEPMIGLVPILKGGRFYQVPVPLPDRRRRPIAMKWMITECRD 

KIOlQRTLMPEKt^HKLLEAFHNQGPVIKRKHDLHKMAEANRAIA 
HYRWW 


5636 
5637 


2253 


1143 


LEDTICQHPPAEKKXiYLYHiRKLREVERKGIPRLPKDVFMDTHQG" 

LTDVEAKVTGFSEGWDSVKGGPSSPSQATHSAAGAWSKPRBI 

ASLIRNKFGSADNIPNLKDSrjEEGQVDDAGKALGVISNFQSSpK 

YGSEEDCSSATSGSVGANSTTGGIAVGASSSKTNTLDMQSSGPD 
ALLHEI0EIRETOARIjEESt?P"rT vptrvotsnvt'T TMi^rnr r^cm5-»*« 

CERLEEQLm)LTEIiHQNEILNl.KQELASMEEKIAYQSYERARDl 
QEAIiBACQTklSKMELQQQQQQWQriEGLEMATARNLI.GKi;.IWI 

llavmavllvfvstvancwplmktrnrtfstlflwfiaflwk 

HWDALFSYVERFFSSPR 




548


2532 


MSFCGARANAKMMAAYNGGTSAAAAGHHHHHHHHLPHbPPPHLH 
HHHHPQHHUHPGSAAAVHPVQQHTSSAAAAAAAAAAAAAMLWPG 
QQQPYFPSPAPGQAPGPAAAAPAQVQAAAAATVKAHHHQHSHHP 
QQQLDIEPDRPrGYGAFGWWSVTDPRDGKRVALKKMPNVFQNL 
VSCKRVFREt^KMLCFFKHDNVLSJlLDiriQPPHIDYPEElyWTB 
CiMQSDLHKIIVSPQPLSSDHVKVFLYQILRGLKYLHSAGILHRD 
r KPGNLLVNSNC VLKICX) FGLARVEELDESRHMTQEWTQYYRA 
PE.iLI>IGSRHVSNAIDIWSVGCIFAELr,GRRILFQAQSPlQQr,D5f 
ITDLLGTPSLEAMRTACEGAICAHILRGPHKQPSLPVLYTLSSQA 
WO 01/53312








THEAVHIiLCRMLVFDPYKRISAKDAiAHPYIiDEGRr,RyHTCMCK " 
CCFSTSTGRVYTSDFEPVTNPKPDDTFEKNLSSVRQVKE I IHQF 
ILEQQKGNRVPLCINPQSAAFKSPISSTVAQPSEMPPSPLVWE 


5638 


125 


1155 


DRKMSELDQLRQEAEQLKNQIRD7VRKACA0ATLSQXTNWIDPVG " 
RIQMRTRRTItRGHIiAKIYAMHWGTDSRLLVSASQDGKL 1 1 WDSY 
TTNKVHAI PLRgS WVMTCAYAPSGWYVACGGLDNICS I YNLKTR 
EGNVRVSRELAGttTGYLSCCRFLDDNQlVTSSGDTTCALMDIET 
GQQTTTFTOHTGDVMSLSLAPDTRLFVSGACDASAKLWDVREGM 
CRQTFTGHESDIWAICFFPNGNAFATGSDDATCRLFDLRADQEL 
KTYSHDNIICGITSVSFSKSGRLLLAGYBDFNCNVWEtAIiKADRA 
GVIAGHDNRVSCLGVTDDGMAVATGSWDSFLKIWN 


5639 


125


1155 


DRKMSElMlLKQEAEQI^QrREJURKACMATliSQITWNlDPVG ' 

RIQMRTRRTLRGHLAKIYAMHWGTDSRLLVSASQDGKLIIWDSY 

TTNKVHAIPLRSSWVMTCAYAPSGNYVACGGLDNICSIYNLKTR 

EGNVRVSRELAGHTGYLSCCRFLDDNQIVTSSGDTTCALWDIET 

GQQTTTFTGHTGDVMSLSIiAPDTRLPVSGACDASAKLWDVREGM 

CRQTPTGHESDINAICFFPKGNAPATGSDDATCRLFDLRADQBL 

MTYSHDNI ICGITS VSFS KSGRLIiliAGYDDPNCNVWDALKADttA 

GVIiAGHDNRVSCLGVTDDGMAVATGSWDSFLKIWM 


5640 



280 


1092 


qqgnkktmlshntmmkqrKqqataimk^ivhgndvdgmdlgkicvs"" 

IPRDrMI^EELSHliSNRGARLFKMRQRRSDKYTPENFQYQSRAQI 
KfHSIAMQNGKVDGSNLEGGSQQAPLTPPNTPDPRSPPNP'DNIAP 
GYSGPLKEI PPEKPNTTAVPKYYQSPWEQAl SKDPEliLBALYPK 
IjFKPEGKAELPDYRSFNRVATPFGGFEKASRMVKFKVPDFELI»I, 
LTDPRFMSFVNPLSGRRSFNRTPKGWlSEaJilPIVITTEPTDDTT 
VPESEDL 


5641


27 


332 


o^hncwgdvkllsnqmdklfafhlftfhgllhfldgsiokliqa 

EII LSDMSSIIiVLENNFLFKVKSKQFIHLI AKKFYIS ITI VSAS 
NGESPVLSMIVTG 


5642


199 


1247 


ITPCRMDFLVIiFLPYLASVLMGLVLICVCSKTHSLKGI*ARGGAQ 
IFS CI IPECLQRAMHGI»I,HYLFHTRNHT FI VliHLVLQGMVYTE Y 
TWEVFGYCQEIiELSLHYIiHiP YLLLGVNI.FPFTLTCX3TNPG I IT 
KANELLFLFIVYEPDEVMFPKNVRCSTCDliRKPARSlCHCSVCNWC 
VHRFDHHCVWVNNCIGAWNIRYPLIYVLTIiTASAATVAIVSTTP 
LVHLWMSDLYQETYIDDLGHLHVMDTVFLIQYLFLTFPRIVFM 
I^FVVVLSPUiGGYLLFVI.YIAAlWQTTNEWYRGDWAWCX3R 
VAWPPSAEPQVHRNIHSHGIiRSNLQBIFIiPAPPCHERKKQB 


5643 


1 


847 


PSGGVRDVETRGPGSRAARGPRWMHRRGVGAGAIAKKIOAEAK 
YKERGT\n:iAEIX5IAQMSKQLDMFKTNLEEFASKHKQEIRKNPEF 
RVQ FQDMCATXGVDPIxASGKGFWSEMLGVGDFYYELGVQtlEVC 
IiALKHRNGGLITLE ELHQQVLKGRGKPAQDVSQDDL I RAI KKLK 
Ar/STGFGIIPVGGTYLIQSVPAELNMDHTWnjQIAEKNGYVTVS 
EIKASLKWETERARQVLEHLLKEGLAWLDLQAPGBAHYWLPALF 
TDLYSQEITABEAREALP 


5644 


B3 


1138


fKianvjOHvyuj. AiSVtjFVUUJMHPCiW'XVAyQFQEKKRFTEEVIEyFQ 
KfCVSPVHLKILI*TSDEAWKRFVRVAEIiPREEADAI.yEArjKNI*TP 
YVAIEDICDMQQKEQQFREWPLKEFPQIRWKXQESIERLRVIANE 
lEKVHRGCVIANWSGSTGILSVIGVMrAPFTAGLSLSITAACSV 
GLGIASATAGIASS IVENTYTRSAELTASRLTATSTDQLEALRD 
ILHDITPNVLSFALDFDEATKMIANDVHl*LRRSKATVQRPIiIAW 
RYVPINVVETLRTRGAPTRIVRKVARNmKATSGVLVVLDVVNL 
VQDSLDIiHKGEKSESAEIJuRQWAQELEEWLNELTHIHOSIiKAG 


5645 


537


799 


VQSVRDIiKRliSPTDPPGDSGNRDVTEiEDPVTGPLNSASSQVPTli"' 
YLCMNSLLGHSSVEDARATMELYQXSQRIRARRGLPRIAVSD 


5646 


3745 


3328 


AEQYGTS PHIiLPTMH^SSCLPPAm/TTKAATPPPLVLSLTTADP ' 
AGKPAPCRVTLTUiRAS I PATKRASEXSS FIKMPPEELE YII»GF 
LSliDKFHVHVSVYSAICHFQKEGTGMSRSFTCTPELPPRl^THIi 
RAEGGAQ 


5647 


288 


800 


GVIMATSELSCEVSEENCERREAFWAEWKDLTIiSTRPEEGCSL^ ' 
EEDTQRHETYHQQGQCQVIiVQRSPMLMMRMGILGRGI.QEyQLPy 
WO 01/53312 PCT/US00/34263








yKVLPLPIFrPAKMGATICEEREiyrPIQLQELtJ^iETAi;GGQC\^ 
RQEVAEITKQLPPWPVSKPGALRRSLSRSMSQEAQRG > 


5648 
5649


7 


1518 


VLSr.LCGRHfc:Ai^EVGAEWPPPTCSPKICSGLQQAGNTDWSLTM 
APQSLPSSRMAPLGMLI/SLLMAACFTFCLSHQNbKEFALTNPEK 
SSTKETERKETKAEEELDAEVLEVFHPTHEWQALQPGQAVPAGS 
HVRLMIOTGEREAKLQYEDKFRNWLKGKRLDINTNTyrSQDLKS 
AtAKFKEGAEMESSKEDKARQAEVKRLFRPIEBLKKDFDEUIW 
IETDMQIMVRI.INKFNSSSSSLEEKIAALFDLEYYVHQMDNAQD 
LLSFGGLQWXNGJLNSTEPLVKEYAAFVLGAAFSSNPKVQVEflJ 
EGGALQKI.r.VILATEQPI,TAKKKVLFALCSI,LRHPPYAQRQFLK 
LGGLQVLRTLVQEKGTEVLAVRWTEiDYDIiVTEKMFABEEAELT 
QEMSPSKLQQYRQVHLLPGLWEQGWCEITVHLLALPEHDAREKV 
IiQTLGVt*LTTCRDRYRQDPQLGRTLASL<2AEYQVLASLELQDGE 
DEGYFQELLGSVNSLLKELR 


5650 


1172 


3006 


KLQEQLDAli^EEIRMIQEEKESTELRAEElETRVTSGSMEALNIi 
KQLRKRGSIPTSLTDLSI>ASASPPLSGRSTPKLTSRSAAQDLDR 
MGVMTLPSDLRKHRRKLliSPVSREENRBDKATXKCETSPPSSPR 
TLRLEKUSHPALSQEEGKSALEDQGSNPSSSNSSQDStHKGAKR 
KGIKSSIGRLFGKKEKGRLIQLSRDGATGHVLLTDSEPSMQEPM 
VPAKLGTQAEKDRRIiKKKHQLLEDARRKGMPFAQWDGPTVVSWt. 
ELWGMPAWYVAACRANVKSGAIMSALSDTEIQREIGISMALHR 
LKLRI*AIQEMVSI,TSPSAPPTSRTSSGNVWVTHEEMETI»ETSrK 
TDSEEGSWAQTrAYGDMNHEWlGNEWLPSLGLPQYRSYPMECI^V 
DARMI.DHX,TKKDLRVHl.KMVDSFHRTSZ^YGIMCIiKRI:NYDRKE 
LEKRREESQHEIKDVl,VWTNDQVVHtArQSIGLRDYAGNlJHESGV 
HGALl,ALDENFDHNTLALII,QIPTQNTQARQVMEREFlWrLLAr.G 
TDRKLDDGDDKVFRRAPSMRKRFRPREHHGRGGMLSASAETLPA 
GFRVSTLGTDQPPPAPPKKIMPEAHSHYI.YGHMI.SAFRD 




1172 


3006 


tU^EQLDAINEEIRMIQEEKESTELRAEEXErrRVTSGSMEAItNL 
KOLRKRGS 1 PTSIfTDLSLASASPPLSGRSTPKLTSRSAAQDIiDR 
MGVMTLPSDLRKHRRKLJUSPVSREENREDKATIKCETSPPSSPR 
TLRLEKLGHPALSQBEGKSALEDQGSNPSSSNSSQDSI/HKGAKR 
KGIKSS IGRLFGKKEKGRLIQLSRDGATGH VLLTDSEPSMQEPM 
VPAKLGTQAEKDRRLKKKHQLLEDARRKGMPFAQWDGPTWSWI. 
ELWGMPAWYVAACRANVKSGAIMSALSDTEIQREIGXSNALHR 
LKLRLAIQEMVSLTSPSAPPTSRTSSGNVWVTHEEMETLETSTK 
TDSEEXSSWAQriAYGDMNHEWIGNEWLPSLGLPQYRSYFMBCLV 
DARMl^HLTKKDIJiVHLKMVDSFHRTSLQYGlMaCjKRLNYDRKE 
LEXRREESQHEIKDVLVWTNDQWHWVQSIGbRDYAGNIiHESGV 
HGALLALDENFDHNTIJU.ILQIPTQNTQARQVMEREFNNI.tJUiG 
TDRKUDDGDDKVFRRAPSWRKRFRPREHHGRGGMLSASAETLPA 
GFRVSTLGTLQPPPAPPKKIMPEAHSHYLYGHMLSAFRD 


5651 


646 


1869 


AKQGQRQPWG*EARAKGPASESPRV*EGSGWEGP7VSP*tPGSTL 
AW6EGAGIR*ASGLTAAGAASAAAA/PPPTRGGPAPAGCGRAPP 
WPAPriRVPTHGRAPAPRSf>AAPRAPALSHGTAAAAI,SPASPA6P 
ADP*LPGHSSQSPPRG*RWGRSRSAPAPAHPEHPAPAGSASASQ 
QTPGWPGSCCLAQGWQAEPLGAPGAEDGNpvpPOPGF-PT.rTTr'c 

PAGSWAGXAGYG*AGAPGTQATAPRAAGQTPVAAAPNCRV*GSA 
PALHRAPAAADPGSPLQAPPRAWASPAAAGPGLSSSDYCGGLGA 
GWRAGISPELIiGAAGI,SDNWARCPGPGPAE*GGQPGCRTIPASA 
CMPSP PVEGSLGLSRKGHGDLPSQAR* GWHECRRARHLVPLPRI. 
t>GPRGRTGRPSSPS 


5652 
5653 


735 


1 


HHKKYQHIHQKSFSCPEPACGKSFNFKKHLKEHMKtaSDTRDYI 
CEPCARSPRTSSNLVIHRRXHTGEKPLQCEIOGFTCRQKASLNW 
^QRKHAETVAALRFPCEFCGKRFBKPDSVAAHRSKSHPALLLA 




66 


1401


=iGRLQSRGRLTLGLVXiLIiLDILGARQHGQRVSHGWKGGFLTAPL 
ZFPQPCQPGTRRGRRIiSLKEATEPQLAMAEEFVTIiKDVGMDFTL 
3DWBQLGLEQGDTFWDTALDNCQDLPLLDPPRPNLTSHPDGSED 
jBPLAGGSPEATSPDVTETKNSPLMEDFPBEGFSQEI /SRDVia 
5HLLELQPRESL YRGHLVR ♦ FARRSRKSSEV* YCHQRGKSHGMQ 




5654






" ^^'K'XQSCVHRFHGRRFHG\DNyst;KTJU'rPAKSKEYRGEFF 
SYSDHSQQDSVQEGEKPYQCSECGKSPSGSYRLTQHWITHTRfeK 
PTVHQECEQGPDRKASHSGyPKTHTGYKFYVCNEYGTPPSOSTY 
LWHQKTHAGEKPCKSQDSDHPPSHDTQSGEKQKTHTDSKSYMCN 




5655


3 


598 


TLPLFPGRRFRGVgRRCUAVAAKKWSTGGNVSlWQRRDSVRMSAL 

NWKPFVYGGIASITAECGTFPIDLTKTRFQIQGQTNDAKFKEII 

YRGMLKALVRIGREEGLlCALYSG*VGLHAFI.CHCSriPKMGIDFR 

PRLHRSQVKSLRCV*KEQlA**/MFSLLISTLISKyiYYAADVI. 
EKLFYYXQVQTDNNKKICLFKNl ^xxyjfAADVI, 




5656 


2 


867 


KPPGIRAPRQI,HPAAGKKPUASARPRFRPTVtLhDPF0L5FPPP 
PLSYPSVFPAVARVLPQRSGDYRAAGMPQLSGGGGGGGGDPELC 

atdemipfkdegdpqXrekifaeivnpeeegdladiksslvnes 

EIIPASNGHBVARQAQTSQEPYHDKAREHPDDGKHPDGGLYNKG 

psyssysgyimmpnmnndpymsngslsppiprtsnkvpwqpsh 

AVHPLTPLITySDEHFSPGSHPSHIPSDVNSKQGMSRHPPAPDI 
PTFYPLSPGGGGQITPPLGMC2GQP 




5657 


228 
105" 


1066 


FKKVPPLiPEFA^GPGAAFFHSGkLOkSLiTKDgAftr-Rtinf'PCPivM - 

lvlrsoltkalasrtiapqvcssfatgprqypgtpyefrtytlk 
psnmnafmenlickwxhlrtsyselvgfwsvefggrtnkvfhiwk 

YDNFPHRAEVRKAIxANCKEWQEQSIIPMIARIDKQETEITYXIP 

wsklqkppkegvyelavfqmkpggpalwgdaferainahvni^ 
tkwgvphteygelnrvhvlwwnesadsraavrhkshedpiswg 




5658




1052 


GaRLQSPRVQMFVQPPSKDTEEMEAEGDSAABMNGEEEBSEEER 

sgsqteseeessemddedyerrrsecvsemldlekqfselkekl 

PRERIiSQLRLRLEEVGAERAPEYTEPLGGLQRSLKIRIQVAGIY 
KGFCLDVXRNKYECELQGAKQHLESEKLLLYDTIiQGELQKRIOR 
LBEDRQSIiDLSSEWWDDKLHARGSSRSWDSLPPSKRKKAPLVSG 
PYIVYMLQEIDIIiEDWTAIKKARAAVSPQKRKSDXDLDPAVHSO 
^^S^CTQDSRLPPADRRTHRPLRVCPARI^LWCCWALPiail! 


5659 


2346 


3541 


TERRVYNPWPEPDPD^ClQEbPWNLPMSIKl'LVDNIQRYVEDGK 

NQLLLALLKCTDTELQLRRDAIFCQALVAAVCTFSEQDIAALGY 

RYNNNGEYEESSRDASRKWLEQVAATGVLLHCQSLLSPATVKEE 

RT^a^EDI*m:LSELDNVTFSFKQLDENYVA^mTVPYHrEGSRQA 

LKVIFyLDSYHFSKLPSRLEGGASLRLHTAX,FTKVLE^EGLPS 

PGSQAAEDLQQDIKAQSLEKVQQVYRKLRAFYLERSNUPTDAST 

TAVKIDQLIRPIKAUJELCRLMKSFVHPKPGAAGSVGAGLIP-S 

SELCYRLGACQMVMCGTGMQRSTLSVSLEQAAILARSHGUiPKC 

IMQATDIKRKQGPRVEIIAKNLR\nCDQMPQGAPRLyRLCQPKMN 
GDIi 


5660 


2 


696 


WKRSGEVSPKGELGAWRGNSGRPKIIGRAAEAENEDRTLGRLI^P 

GNERSQPRSPLRLrAPQLKAEAAADKGliAPVPppFSSGHSGPCV 

EREGEGQRGRGRSRRGAHLELKPSPGLRAGAPTDRGRGGPAEVA 

AAGGRRMVQKESQATLEERESELSSNPAASASASLEPPARPaPr- 

EDNPAGAGG\AAVAGAAGGARRFLCX3WEGFYGRPWVMEQRKEL 
ET?RL^ KWKXiM T YIi 


5661 


229 


853


b-^^^MWAfSELiPMPLLINLiViJU/jl.VATVTLlPAFRGHFIAARL"- 

2GQDLNKTSRQQIPESQSVISGAVFLIILFCFIPFPFLNCFVKE 

3RKAFPHHEFVAI*IGALLAICa^lFLGPADDVLNLRWRHKLLLP 

rAASLPLI.iyrVYFTNFGNTTIVVPKPFRPILGLHLDW5R*SYHCC 

^YGTYFREPPLVLHXXjLQVFIiFCLCVFPDPFW 


5662 


2 


473


^NLYPSPCGGZPKLPGLPREAAAALGASFIAEAPLPVfVRGSGt. 
^GMAVTCDP:<AFLSXCFVTLVFIjQLPLAS1CqN*GTDSCASRGK 
U)FDVTGPHAPir,AMAGGHVELQCQbFPNXSABDMELRWYRCQP 
ILAVHMHERGMDMDGEQKWQYRGRT 




2 


1318


iRKEGRCRRGSMRGVWAAPAEGLGGRGMLGVRCLLRSVRFCSSK 

•fpkhkpsaklsvrdalgaqnasgerikiqgwirsvrsqkevl/ 








X.HVNDGSSL£:SI.QWADSGbDSRELTFGSj:;Vh;VUGQLIKSPSKR~- 

QNVELKAEKI KVIGNCDAKDPP I KYKERHPLE YLRQYPHFRdlT 

NVLGS I LR IRSEATAAIHS FPKDSGPVHTTmOT TTOKmet>o>/^c 
*^ p^sft-je vrimii'i i X^£^NDSEGAGB 

LPQl^PSGKLKVPEENPFNVPAFLTVSGQLHLEVMSGAFTQVFT 

FGPTrRAENSQSRRHLAEFYMIEAEISFVI>SLQDLMQVXEEr.FK 
ATTMMVLSKCPEDVELCHKFTAPnnKnoT aumt vmvtot tt^o™. 

AVEXLKQASQNFTFTPEWGADLRTEHEKYLVKHCGNIPVFVINy 
P^TLKPFYMRDKEDGPQELEGSVA*HSLGLMILLSIW1GQP 


5663 
5664 


119 


698 


PADIGR5TAKTPGPPRfiTlpMnn'o'pvr'M>inr vi**it. f^^r^y^-m ^ ■■■■ 

VQSYFEKGPLTFRDVAIEPSLEEWQCLDSAQQGLYRKVMLBNYR 
NLVFLG lALTKPDLITCLEQGKBPWNXKRHEMVAKPPVlCSIIPP 
QDLWAEQDIKDSFQEAILKKYGKYGHANFQLQKGCKSVDECKVH 
KEHDNKL^QCLXPKKKK 




118


572


£>LSMESNHKSGDGLiSGTQKEAALRALVQRTGySLVQENGQRKYG 
*^irr iroH ^jrtrti' t-aKVA-ii i iricJKIjPRDLFEDELl PLCEKIGKI YEM 
RMMMDFNGNNRGYAFVTPSNKVEAKNAIKQLNUTYEIRNGRLLGV 
CAS VDUCRLPVGG X PKTKK 


5666


347 


702 


WQHLIILLHCERTSPAMITSELPVUJDSTNBTTAHSbAGSELE 
ETBVKGKRKRGRPGRPPSTNKKPRKSPGEKSRIEAGIRGAGRGR 
ANGHPQQNGEGEPVTIjFEWKIiGKSAMQRC 


5667 


213 


540 


w^v-jjjT AOk-ra^ix ii*wiNUUuf vi-f iNiiailPUEYKlAALVFYSCIPII ' 

GLFVNITALWVPSCTTKKRTTVTIYMM.WALVDLIPIMTX,PFRM 

PliTYAKDEWPFGEYFCQILGA 


5668


1 


695 


**jr*^jroj»oAJv»xjt'o vij-AjVi>iiUVKi>Ai»iifclAvvPHLPiUUUU^ 
SGDAASSTPPSTRFPGVAIYLVEPRMGRSRRAFLTGLARSKGPR 
f ■'bk'uwn^.nT V ncic i, oACJCtAVaWQEKKMAAAPPGCTPPAT.T.r) 

ISWLTESLGAGQPVPVECRHRI^EVAGPSKGPLSPAWMPAYACQR 

PTPLTKHNTGLSEAIiEILAEAAGFEGSEGRLLTFCRAASVLKAL 
PSPVTTLSQXjQ 


5669


691 


894 


CSgLFCIPDJbgLQFL.IXSKKEEKAVIiVGGBWSPSLDGLDPQADPQ" 
VLVRTAIRCAQAQTGIDLSGCTKW 




407 


1 


DSGAPEGLSPLMSTQEGLSMHAHPOAYTPFIYLHARKRRGEIGD 
ADSRFNDRYAHKSAQLYFLYFVCWIFQDVYYFTIKEKNHFFFDK 
ARGAPTKYSGS PIGSPTTTPPTRPPSFNIiHPAPHr.r.a.^iMnT nvr 

NSQ 


5670 


3 


373 


SStfJUTMAWIPLLLPLLlLCTVSVASYEIAQPSSVSVSPGQTAK 
ITCSGDVIAKKYARWFQQKPGQAPVLVIYKDTBRPSGIPERFSG 
STSGTTVTLTISdAQVEDEADYFCYSATDNFLWVF 


5671 


280 


524 


KFPPKKTPPHUiMKSAITLWQFLLQLLLIX^KHEHLICWTSNDGE 
FKLLKAKKVAKLWGIiRKNKTMMNYDKI^RALRLLFMT 


5672 


2 


557 


FVPATPDPGVWbi^PSkUPAMAKRSST.YIRIVEGKNLPAKDITGS 

SDPYCIVKVDNEPIIRTATVWKTLCPFt^GEEYQVHLPPTFHAVA 

FyVMDEDAX,SRDDVXGKVCLTRDTIASHPKGKFSLPSHTGIiPSP 

WPPSHSETSPLGSVWSPAQGKPFLLSPBAGATFCTPGLCSAACS 
QAWLLLPLP 


5673 


327 


696 


IWADQISHWSAGRXKNRTRIPECIHSSAATTLAGPHTMEGESr- 
lUjb&u 1 L,i QAQDDEKNQRTITVNPAHMGKAFKVMWEIJiSKQLLC 
DVM1VAEDVEIEAHRW14AACSPYFCAMFTGEMS 


5674 
5675 


17 


984 


GGGbMEGESTi;AVi.SGFVU;AiJU?'QHI^rDSDTEGFl,D3EV^^ 

AKNSITDSQMDDVEWYTIDIQKYIPCYQLPSPYNSSGEVNEQA 

LKKILSNVmm^GWYKFRRHSDQXMTPRERIJLHKNI^EHFSNQ 

DbVFZiLLTPSIITBSCSTHRLEKSLYKPQKGbFHRVPLVVANLG 

MSEQLGYKTVSGSCMSTGFSRAVQTHSSKFFEEDGSLKEVHKIN 

EMYASLQEELKSICKKVEDSEQAVDKLVKDVNRLKREIEKRRGA 

ClIQAAREKWIQKDPQEWIFLCQALRTFFPKSEFLHSCVMSLKID 

MFLKVAVTTTTISM 




80 


753 



EGSRRGPTRi^I^ARAGRLHFPPGFSSRLXHPRGVSECRRPPG 
fCSGVPVSAPGSDGKWWEERPGMFSLMASCCGWFKRWREPVRKVT 
[iLMVGLDNAGICTATAKGIQGEYPEDVAPrVGf^KXNIiRQGKFEtf ' 
riFDriGGGlRIRGIWKNYYAESYGVIFWDSSDEERMEBTKEAM 











SEMpRHPRlSGKPILVJLANKQDKEGALGEyaJVIECLSLEl^^ 


5676 


2 


930


PVSSPPPRPVQPARPGGFGI^GRRSL*LCQVASTPAHVGVMRS?V~~ 
RDLARNDGEESTDRTPLLPGAPRAEAAPVCCSARYNT ATT .A B-Pi" 

FFI^nrALRVNIiSVALVDMVDSNTTLEDNRTS;<ACPEHSAPIKVK 
HNQTGKKYQWDAETOGWlLGSFFYGYIITQrPGGYVASKIGGKM 
LLGFGILGTAVLiTLFTPIAADLGVGPLIVIiRALEGi/SEGVTFPA 
MHAMWSSWAPPLERS KLLS ISYAGAQLGTVISLPIiSGI ICYYMN 
WTYVFYFFGTIGIPWFLLWIWtiVSDTPOKHKRTqHVPTrPVTTce 
Xt 


5677


1 


1028


PPRDGFLELRRLSVPLCSGPCPLTSLSRQGEKSGGKliVAAARAA 
VTAETHPLPLLAPLAVCQSVKSPAACQVRPRPRAVALPAALGGP 
GRSI.PGLTAATMSSFSESALEKKI.SELSNSQQSVQTLSLMLIHH 
RKHAGPI VS VWHREIjRKAKSNRKIjTPLYTAWnirrnwo irti vr-rit3'» 

TREPESVLVDAFSHVAREADEGCKKPLERLLNIWQERSVYGGEF 
IQQLKLSMEDSKSPPPKATEEKKSLKRTFQQIQEEBDDDYPGSY. 
SPQDPSAGPLLTEELI KALODIiENAA<!OnATVDriWT hot o/^c^r^ 

DVSLX^BKITDKHyVAERl^KTVDEACLRNRGPGTS 


5678 
5679 


3 


593 


SSSPPSSTPSi.PLPFYLbLGQIiRLQl.LWGTAHI^GAGEAAPCPG 
GSGRTAAPRTRADPAAQSIiMIMNKMKNFKRRFSLSVPRTETIEE 
SLAEFTEQFMQL-HNRRNENIjQLGPLGRDPPQECSTFSPTDSGEB 
PGQl,SPaVQFQRRQNQRRPSMBVRASGALPRQVAGCTHKGVHRR 
AAALQPDFDVSKRLSLPMDI 


5680 


2 


623 


XJfSRVDDFVAVPGAIMDEDYYGSAAEWGDEADQQnnpnnqnpr'ij — 
DDAEVQQECI^HKFSTRDYIMEPSIFNTLKRYFQAGGSPENVIQIi 
l^ENYTAVAQTVNLIJ^WLICn'GVEPVQVQETVENHLKSLt.IKH 
FDPRKADSl FTEEGETPAWLEQMI AHTTWRDLFYKIiAEAHPDCL 
MLNFTViCVGRVLEriRRKVFMNVYFWLLVCFL ^ 




258 


592 


RRLTSTSEKLQNRN;aHTPLESLIHP6ptJ^KGFGIMFGKKlfK-yTT?~~ 

ISGPSNF^RVHTGFDPQEQKFTGLPQOWHSLIADTANRPKPMV 
DPSCITPIQLAPMKTIVRGNKPC 


5681 
5682


45 


869 


IiLCAKTLGVRTKESQABGYNRSGINNHQAEDPRFCPSFCWMRSA 
RQTRPQRU^KEAARPPTPGSCPGGTGMDGKKCSVWMFLPLVFTL 
FTSAGLWIVYFIAVEDDKILPLNSAERKPGVKHAPYISIAGDDP 
PASCVFSQVMNMAAFriALWAVLRKIQLKPKVLNPWI^lSGLVA 
LCIiASFGMTLLGNFQLTNDEEIHNVGTSLTFGFGTLTCWIOAAI* 
TLKVWIKNEGRRVGXPRVILSASITLCVGPLLHPHGPKHPHVCS 
QGPVGPGHVL 




39 


622 


PSRSCLGTMRKWRHREVNLPEVTQQDAVCPAPIPSPGLSAQTGIi 
QKIWGTXHCQVCPGAPAWPGSPWHEEMGliLLr.VPLLIJJPGSYGL 
PFYWGFYySNSAWDQNrx;NGHGKDXiLMGVKIjWETPEETl.FTYQ 
GASVILPCRyRyEPALVSPRRVRVKWW:<I>SENGAPEKDVI.VAIG 
LRHRSFGDYQGRVHliRQD 


5683 
5684 




778 


GSCGATALITRCLAWSVLljykLAMATYTClTCRVAFRDADMQRA 
HYKTDVJHRYNLRRKVASMAPVTAEGFQERVRAQRAVAEEESKGS 
ATYCTVCSKKFASFMAYENHIiKSRRHVELEKKAVQAVNRKVET^ 
uu<>.viuv3 V V _/fVL>^u'uii./4/\xyyM.x aAQPSMS PKKAPPAPAK 

EARNVVAVGTGGRGTHDRDPSEKPPRLQWFEQQAKKLAKHSEDD 
SEDEBHDLC 


5685 


195 


577 


TWCl?-R^YJ^PRVIM:<ALDKPPYLTV61'bVSAKYRGAFCEAKIKr 
AKRLVKVKVTFRHDSSTVEVQDDHIKGPLKVGAIVEVKNLDGAY 
QHAVINKLTDASWYTWFDDGDEKTLRRSSLCLKGERHFAESET 
liDQLPLTWPEHFQTPVIGKKTNRGRRYE 


5686 


779 


1262 


IJ^LCKJPVVHCFLLFPPFRFiiHHMIPGPPGPHTTGIPHPAIVTPQ 
VKQEHPHTDSDLMHVKPQHEQRKEQEPKRPHIKKPLNAFMLYMK 
EMRANWAECTLKESAAINQILGRRWHALSREEQAKYYEXARKE 
RQLHMQLYPGWSARDNYVSPSSIPVTVLHS 




128 


1181 



CrrWWQWITLtJDIWDNHPTWKJOAPYYINLVEMTPPDSDVTTWA 
mPDLGENGTLVYSIQPPNKFYSLNSTTGKlRTTHAMbDRENPB ~ 
PHEAELMRKIVVSVTDCGRPPLKATSSATVFVNLLDLNDNDPTF 











QNLPF VAEVI.EGI PAGVS I YQWAIDLDEGLNGIiVSYRMPVGMP 
RMDFLINSSSGWVTTTELDRBRIAEYQLRVVASDAGTPTK^T 
STLTIHVLDVNDETPTFFPAVyNVSVSEDVPR\GSGWSG*AARN 
NDVGLNAELSYFITGGNVDGKPSVGYRDAWRTWGLDRETTAA 
YMLlLEAIDNGPVGKRHTGTATVFVTVXiDVNDKRPIILQSSYV 


5687 


17 


917 


AAPPAPPDG/ PPP/ PPPAPPT/ PGPAA/ APASSCQPRLS AGRAA" 
QGPGGAAAVGHVLWPAVGPVRVNPGLQTPVPRPEIiLPGPXsSS 

lhsdss yppdaglsddeeppdasi.ppdpppi.tvp/ada/ pmpvt 
sgcrmpstsase/aaggqgacthakgsetpppaspqtsepapsp 

t.PPHLTGGPGMYSSEAKLPNSFSCLGl*AGTGAGI*GTASAHGTG 
PPVLPHVCTPSIiANPQPXAVGPEASSLPriGVSGXGMSA/SAPrs 
SSPFVAIGSCWLRGIPPPGSGFLCPGRAPGPVPrTTHGQEGQGP 
VLDI 


5688 


1 


420 


LTKVJDLFGNCYRLLKTG I EHGAMPEQVGVYWYS / Ci:.YDSRKI.FF ' 

*SHMIIRSLL*KVIDDSLGQLPLLRELLL* *r.NVTDRCI ILAYV 

LRVEKTFAITYLKNFTVKVDFSLLGErPLISMAAILKLWIMKID 
DGYIPAVF 


5689 


1504


3 


helsg khismvsgntcnwhpgghspggggqge itskdrgbi pal " 

iwa/rk?igtwtatkpthrag*ggaeeyqpppqpcegprstsrg 

geg*ghavgpgreigkegslpflgpkalgf*sascqrafeggah 

gstarkpapatpgtrhprtmetrevaqgwpagprsqfwdqhphs 

3pgehrpsg\splpacpprat?pkagavasatgtg\pqiipgsrgkq 

klprtreppllqagwavrkppwseakeglgqagrpsgmdssasx 

pqtpggrgst*ewglpxivlgphhdvk*rsdrlg*pp*ggqggggh 

6apstpgpggeaw*dpqqtsrpkpgpqay*ge\gspgiiqcpcsk 

el*rvppgslgpstqckyeptdkhs\ggadaqlevstagsrstf 

gqelkgpldagriiwpgapsassshr*gg*eraragaghrgst*a 

sskieqgrprpgptsdaiiadveggaes/gphpwplpgtlpnr/p 

GSPPPA*ASAGRKG'rVSTLGGGi:X 


5690 


1424 


58 


PSPFAGVCTU^PAPLPLIJUiARRDRRPCSPGAEAAPWQTGGPAXD 
GAWRTSVSAlitRiGATG/APCSPGAEAAPWQTaGPAIDG\DGELP 
*VRSEEAPRGCGAEGGGPGSOPVRRPGAGRGAHAGCGRQQDPEP 
DGLRHRQHGAASHARHRLQRLRPGHHQNRHVRRDPQAPPGGPAP 
GHAAALPERTRGVAEPPAWAHAGSDAWRAGR*SQRT*ERARPRH 
PTFQGRAGS\GQPGYQPPNPHPGPSSPPAAP\GPRGA*GNPQLE 
KAPRSDRWPSC2GLRTRIRRPETPD0GPPSPAGSSASASTFRCTS 
SLSrJtiGP/PGAHNIJDTAPQDR*HGP*GDKRGAPGVAGEDPRPP* 
GNPVR* LIjIiMP/GVA* RHGTS PFI^PSLGENGGQMDSGNIiFGTP 
KG*SHPAFTKST*S«EAEKSYWNHPHR\DRGRQGVRINCLRVGE 
s em wg p ys aprpgtvfltssfls paseeh \ pegsssfntpfppag 
PEGDPGLWSPGIjLP 


5691 


107 


550 


ISNDPSPGYNIEQMAKRGKKI;VEL,PYTVKGMDVSFSG-LSFIED^ 
VAHRMIiATGECTPEDLCFSLOVMQ* KTOTESWG*RFYI VEQN* S 
GDAPLI FS P YI*SLTGNCGFAMLVEITERAMAH\ CX3SPGGPSLWG 
GVGVYVLIiES VPLS YS 


5692 


1193 


548


TQAWTRAEKDRKGS VRALRLHLERGP PT * RGSHPIi\qS VPCIQK 

ps i fss yp i /glpqsgqepgpvgeqqpvrrpeqpscgpashmpri 
tsrsvppgrgalppdslstrkglprpstaghrvresghkvpvsq 
rlnlpvmgatrsnlqpprkvavpgptr*rdqdskqdfsskpriqs 
vpglastqqtltpadsg pgtggrdatraglpgvetkgngvd 


5693 


1258 


1330 


ALTWPVRKGTTWWAQPHGCSNLVSRARLDLSSRPSQNTEPQAP 
* QAGPPSSLRPP\SRRR * APEWPKRATGSRCRGLSAPPWPWPAA 
RGE/PQSAPSHAP/pNSJRPSGTRHP/PGPSSRVTiYSPSIiPRNS 
PEAIVWRSSRFPLWFPIjRCCFWVSGFKDPNPVLRFF 


5694 


3 


1338 


GS KEPARSLHRRGSGHKSSAGKWGSVTLSTAGAIiG*KQLHd* WT 
QRCL\NNLSSEEFNASSSLMSLPSTPTASRRNSTIVLRTDSEKR 
SIAESGLSWFSESEEKAPKKLEYDSGSLKMEPGTSKWRRERPES 
CDDSSKGGELKKPISLGHPGSLKKGKTPPVAV^SPITHTAQSAL 
KVAGKPEGKATDKGKl4AVKOTGLQRSSSmGRr>2iLSDAKKPPSa 
IARPSTSGSFGYKKP FPATGTATVMQTGGSATLSKIQKSSGIPV | 











KPVNGRKTai4DV5NSAEPC;yiAPGARSNIQyR5LPRMKSSSMS 
VTGGRGGPRPVSSS I DPSLLSTKQGGLTPSRI,KEPTKVASGR^T 
PAPVMQTDRBKEKAJfCAKAVALDSDWISLKSIGSPESTPKNQASH 

PTATKLAELPPTPLRATAKSFVKPPSL.ANLDICVNSNSr,DLPSSS 
DTTQCI 


5695 
5696 


3 


1338 


GSKEPARSi,KKRG3GHKSSAGKWGSVTLSTAGALG*KQLHQ*WT 

QRC:.\NNI.$SEEFWASSSLNSI.PST£>TASRRMSTIVLRTDSEKR 

SLAESGLSWFSESEEKAPKKLEYDSGSLKMEPGTSKWRRERPES 

CDDSSKGGELKKPISLGHPGSLKKGKTPPVAVTSPITHTAQSAI* 

KVAGKPEGKATDKGKLAVKNTGLQRSSSDAGRDRLSDAKKPPSG 

lARPSTSGSFGYKKPPPATGTATVMQTCGSATJbSKIQKSSGIPV 

KPVNGRKTSLDVSKSAEPGFtAPGARSNIQYRSLPRPAKSSSMS 

VTGGRGGPRPVSSSIDPSLLSTKQGGLTPSRLKEPTKVASGRTT 

PAPVNOTDREKEKAKAKAVALDSDWISIiKSIGSPESTPKNQASH 

PTATKLAELPPTPLRATAKSFVKPPSrANLDKVNSNSLDLPSSS 
DTTQCI 


5697 


3 


1338 


gskeparslhrrgsghk$sagkwgsvtlstagalg*kqLhq*wt" 

QRCL\NNLSSEEFNASSSLNSLPSTPTif^RRWSTIVLRTDSEKR 

SIAESGLSWFSESEEKAPKKLEYDSGSLKMEPGTSKWRRERPES 

CDDSSKGGELKKPISLGHPGSX.KKGKTPPVAVTSPITHTAQSAI1 

KVAGKPEGKATDKGKIAVKNTGLQRSSSDAGRDRI^nAKKPPSG 

rARPSTSGSFGYKKPPPATGTATVMQTGGSATLSKIQKSSGIPV 

KPVNGRKTSLDVSNSAEPGFLAPGARSNIQVRSLPRPAKSSSMS 

VTGGRGGPRPVSSSIDPSI.LSTKQGGLTPSRLKEPTKVASGRTT 

PAPVNQTDREKEKAKAKAVALDSDNISLKSIGSPESTPKNQASH 

PTATKlAELPPTPLRATAKSPVKPPSIANLDKVNSNSLDItPSSS 
DTTQCI 


5698


1147 


47 


psealsppaupsapaprrsiisrlfgtspateaappppepvpaa 

QGPATVQSVEDFVPDDRLDRSFLEDTTPARDEKKVGAKAAQQDS " 

0SDGEALGGNPMVAGFQDDVDLEDQPRGSPPLPAGPVPSQDITL 

SSBEEAEVAAPTKGPAPAPQQCSEPETKWSSIPASKPRRGTAPT 

RTAAPPWPGGVSVROiGPEKRSSTRPPAEMEPGKGEQASSSESDP 

EGPIAAQMLSFVMDDPDPBSEGSDTQRRADDFPVRDDPSDVTDE 

DEGPAEPPPPPKLPLPAFRLKNDSDLFGLGLEEAGPKESSEEGK 

EGKTPSKENKKKKKKGKEEEEKAAKKKSKHKKSKDKEBGKEERR 
RRQQRPPRSRERTAA 


5699 


2 


666 


tiAEAAEPQEDliPPLSQSSRFFQEQQKMNKSIiGPVSFlCDVAVDFT 
QEEKQQLDPEQKITYRDVMLENYSNLVSVGYHIIKPDVISKLEQ 
GEEPWIVEGEPLLQSYPDEVWQTDDLIERIQEEENKPaRQTVFI 
ETLI*R/ERGNVPGNTFDVETNPVPSRKrAYTHSLCNSCER\GF 
J>lftSSKYI SSDGRYARMKADECSGCXSKSLmiKLEKTHPGDQAYE 

pnq 


5700 


2 

92i 


1448 



KVRgpPGLWVKRTVPAMQCPAGLSRVPGVAG/DPSLPSFRGPRD 
iWUiK^ji iQXARHTRKrjYVQGPASGPPIjpRVSTQVAI*DEKPLA 
RPS/GRTNAPFPQGQKPAGKAAPGPAAAGRVAMR\PGHPGLIAS 
DSQRSSSXGSGWETPVPWS*AQPGWVSGra^LLGDPSGPGSL*RS 
TWI.VGGARGPEGSGVRGSGWPSGCSDlGWAIAGWWHS*Ht»DPNT 
WTQKWTGE/SPAPGEEG\VAPAPRGPTAEHQHCELTTESQySKN 
VPILFQNPSGALRSRRTEPAGWVPPTRHE*DDG*TAAPASGGAP 
VSTPTWAGTP/LWASLGPTDPQGKPGCRPPCALPKPAGPERSA* 
GGSLGCR/SMLPASSGPPPAPGPRRIAAGAHTSASARCPPAAAA 
GWQPRRPGFAGRAALPGPPHPPSS^RELGGLPGPGW+TIDPLPA 
^P^PGSAPPWGALGGWAAARASLPWSPSIiCLSFPAVTPVAGL 


5701 




597


Nii«i«5VVfEINIY*RRSWIHKNSKSESHIiNQbHyFPPPTPMSARS 
KLHSIiGTAKNTGLPIiSGAPRQRAVFSGRTlCQEFSSCLQCAYLD 
S*CSIASSLIKAII,RVSVLSE 




59 


410


L^EKICSDTQEFISPEINPQICSWLiFDKGAK/NHATGKDSEFir' 
<WSWKNWLSTCR*MRPGPYFTPyTKINSK*IK/DANIRCETVKI. 
-EEKTGENLHDTGLGNVFLDMTPKTQPTKQK f 


5702 


. 3 


1517 


ETFVDPSQCGGiPSDSPHfe>VlTPSRASESSASSDGPHPVITPSR'- 

ASESSASSDGPHPVIT?SRASESSASSDGLHPVITPSRASE3SA 

SSDGPHPVITPSRASESSASSDGPHPVITPSRASESSASSDGtH 

PVITPSRASESSASSDGPHPVITPSWSPGSDVTLIAEALVTVTN 

IEVINCSITEIETTTSSIPGASDTDLIPTEGVKASSTSD^»PAI*P 

DSTEAKPHITEVTASAETLSTAGTTESAAPHATVGTPLPTNSAT 

EREVTAPGATTLSGALVTVSRNPLEETSALSVETPSyVKVSGAA 

PVSIEAGSAVGKTTSPAGSSASSySPSEAALKNFTPSETIiTMDI 

TTKGPEPTSRDPLPSVPPTTTNSSRGTNSTLAKITTSAKTTMKP 

PTATPTTARTRPTT\A*VQVKMEVSSSCG*VWLPRKTSLTPEWQ 

KG*CSSSTGNSTPTRI.TSRSPYCVSGEANG/PSAAARHVPyAKR 

GCCP^PGPPPTDCSCVTVURGTQKVPMKGSMSKPLTPDVATGPS 

LTSTG VYVWGGAS P VPRGVIGIiTIiAH VLC FS KEKT 


5703 


14 


1117 


HHKDSRSQGI.PRTQECARPELRPLLCPRALWPVTRLSyRCPWQA 
PKAGIGTKAKPSESHLKLHPGWPSLDRQGEPArLGTGTGHCSDS 
RILRWHP*HTAAR* PRWRRIiPSSHRWTRHLGVI*RVQDKS * * VSL 
DPSCRPRFLRTC**yGMRSVASSSNPPPGWSGPGASVFPARPVS 
AIiPTGPRCW*APRGRTRQPCGWPRLSSPHATADWGPGCPIiSPSR 
GSWETAPGS*WCPWL*AARWTGWRTASGASAGLGRAADRPSAWA 
RRVAGLLPGQGLTVRR*H*TAGAPASVRSSQGATRSPAPGGDQC 

AOGRGPGSC*HPPPWPVSPSSPVPGPSGR*HLRGPUJSAARPRA 
AGWPRHSPHDTQTPEP 


5704 


23 


562 


GDYEFDSPyWDDlSQAAKDLVTRLMEVEQDQRITAEEAISHEWI — 

SGNAASDKNI KDGVCAQIEKNFARAKWKKAVRVTTLMKRLRAPR 

QSSTAAAQSASATDTATPGAAGGATAAAASGATSAPEGDAARAA 

KSDWVAPRRP^LPPQPQMEVPPQPimVSPQPPMEASLQPLMGE 
SPQP 


5705


23 


562 


«^<r 1 "Ayiy4.;>y/uij!u^ijV AKijKJiVh;UlJyRITAEEAISHEWI 

SGNAASDKNIKDGVCAQIEKNFARAKWKKAVRVTTLMKRLRAPE ' 
QSSTAAAQSASATDTA'JPGAAGGATAAAASGATSAPEGDAARAA 
KSDNVAPRRP *Ii5»PQPQMEVPPQPLMAVSPQPPMEASLQPIiMGE 
SPQP 


5706 


1161 


610 


U1.GKFXAQDTVAIRKVKEVPGTGAMRHWILFTHKED*GGQAU3 
DYVAWTDNCSLKDLVRECERRyCAFNNWGSVEEQRQQQAELLAV 
lERLGRERBGSPHSNDLFLDAQLLQRTGAGACQEDyRQYQAKVE 

WQVEKHKQELRENESNWAYKAIJLJlVKHtmLHyEIFVPLLIiCSI 
LFFIIFLF 


5707 


28 


609 


GSPAPTPGPRRRPGRGTPSPGrRHHQGRAEPEPDAPERAPbRR* 
MFAIQPGIiAEGGQFLGDPPPGLCQPELQPDSKSNFMASAKDANE 
IJWHGMPGRVE PILRRSSSES PSDNQAFQAPGS PEEGVRSPPEGA 

EIPGAEPEKMGGAGTVCSPLBDNGYASSSLSIDSRSSSPBPACG 
TPRGPGPPDPLLPSVAQA 


5708 


44 


1925 


S FSWE£TXSPCFPKMPAE'p'wwr>s PV<^ T ^aaf^MPrfnop bvr KtT^K — 

QASVSRPHDRA*GEAVSLSLSSGDVCGHTDGGGAGSDPQAKPKP 
PRCPFTAMPSPRTKQKVRNKVCLLlAIRySDIPSDVSKAP\GPA 
GNPHDRSSTAA*LHRRAGAGSLCLSASLLPPSFSLGAPGAPSPL 
RVSPASGGPRKEGRQGSGG*AGGGGP\ARTHADLPCVGFVCSPP 
LLK*SDSPVKQLPA\SGQGSGAGMPPVGSSDILRPRPTSVSGTG 
RAAG*CSWQPAACCTPRSQ*WAVARSPSRCSRW*RQSGR*RG*S 
SRRRRGP*AAGRSTPAVP*PCS*GGAGRRAyACRTGWGyAPSR* 
LEPSGPTSGSAL* TWASHSTGA* *SRLCGTAGTGPLCSQSSRS * 
AG*RCCCTAASPCGGSGPSHPGSPSAHCr*SWSGGRTQPRAPSAH 
G31GRAMGSRCVCTCTGLPCPGIPLSGASPGGSGETGAGRSHTLK 
AARSRLSPRPGSGSRGSY*SHNDNWGTWPAPPSAGHLLVGG*NS 
QRTSSDH*YTGTRRPWAGPGTRCSTAPSRAAPPVSRCRPPPPPP 
PPRPPRLPAAAS /SGGASGSPA^VSCS CSCRAPAKPASS /GEAPA 
PPPRPEPPPPPARRP 


5709


2 


2031


ITIiCPLPQTEKCIiWVVTEAATPLGiyLKARVEAGGLKEIiEISWG 
LHQI WRWGAWMRAGMGGCRCTGVMAPFAPK/KALS FLVNDCS 
LIHl\nWCi4AAVFVDRAGEWlCLGGLDyMySAQGWGGGPPRRGI£t: 








LBQYDPPEIJ^SSGRWREKRSADMt^^RIiGCLIWEVFNGPljPRA^ 

ALRNPGKIPiCTIiVPHYCELVGANPKVRPNPARFLQNCRAPGGPM 

SNRFVETNLFLEEIQIKEPAEKQKFFQELSKSLDAFPEDPCRHK 

VLPQLLTAFEFGNAGAWLTPLFKVGKFLSAEEYQQKIIPVWK 

MPSSTDRAMRIRLLQQMEQFIQYLDEPTVNTQIPPHWHGFLDT 

NPAIREQTVKSMLLLAPKIJTEANrjmLhmfFAiUXJAia^ 

RCNTTVCLGKIGSYLSASTRHRVIiTSAFSRATRDPFAPSRVAGV 

USFAATHNLYSIdNDCAQKIDPVLCGLTVDPEKSVRDQAFKAIRS 

FLSKIiESVSEDPTQLEEVEKDVHAASSPGMGGAAASWAGWAVTG 

VSSLTS KL I RSHPTTAPTETNIPQRPTPBGVPAPAPTPVPATPT 

TSGHWETQEEDKDTAEDSSTADRWDDEDWGSIiEQEAESVLAQQD 

DWSTGGQVSRASQVS\TPTTNPPNPQSPTGAAGK\RGLLGTGLA 

GAKLPGATS *RYTAGQRV 


5710


1 


562 


IPGSTISCEVELMARMAKTIDSFTQNQTRLWIITCIJiyiCEQDK 
VLQMLDTVRVIiFSKGPFlAIFASDPHIIIKAIK<»ILNSVPSGFK 
\LNGHDYMRNIVHLPVFLNSRGL/RQ/IiQENFS*LQCX3MBTPHA 
QILQGYRKKIiTEEFHRTALGR*QNLVARQPSlDG*DAIGFELYV 
CIAIQFimJKDDAT 


5711


1526 


1130 


RRHPFQWTTVTQEAPSHHDVAFTSTPVLFYPDSAQPFIVKSESS 
SQIAKAVr»SQQRPSLFHECAFHPFS*SLQRHTINl,DQGIP*LI>M 
I*SEBRQHLFESS/IWTTPHWLK*/FEIHEHLGSHEGHWTIiFFLIi 
QIL 


5712


3 


1391 


GRKLFQSbDISERLKFbLTbDCVDDTLI VLAEEHGCIiDII KELP 
ETVIDI.r.NKCLTFHPSKRPTPDELMKDKVFSEVSPLYTPPTKPA 
SLFSSSUICADLTLPEDI SQLCKDINN0YIAERS lEE WYLWCL 
AGGDLEKELVNKEI IRSKPPI CTLPNFLFEDGES PGQGRDRSS/ 
TFR*YHWDIWMPAKK*IERCWGRSILPITI.KMTSLIi:iPYSNSN 
NELSAAATIiPI, IIREKDTBYQI*NRI ILFDRI^KAYPYKKNQIWK 
EARVDIPPIJIRGLTWAALLGVEGAIHAKYDAIDKDTPIPTDRQI ^ 
E\n5IPRCHQYDELLSSPEGHAKFRRVLKAWVV$HPDIiVYWQGLD 
SliCAPFLYLNFMNEALVYACMSAFX PK YLY WFFLKDWSHVI QEY 
IiTVFSQMIAFHDPEI.SNHrOTIGPIPDLyAIPWFLTMFTHVFPL 
HK1FHLW\DTLLLGEPLPPILYWE 


5713




284 


PVCAVPVDRWPV1.PREDQEGQQL*AKLPRDFRR*FQILGPMEGH 
TACRCSRRGAQVQHLPREDIRAAE*DPH1jREVWPGLPTSSATSP 
♦RAVLTSPCSHLGSADAASSHWIiCGVSFH 


5714


212 


613 


WGLGW3PTMSSLGGGSQDAGGSSSSSTN6SGGSGSSGPKAGAAD" 

KSAWAAAAPASVADDTPPPERRNKSGIISEPLNKSLRRSRPLS 

HYSSFGSSGGSGGGSMMGGESADKATAAAAAASLIANGHDLAAA 
MA 




131 


1979 


ESASQQKRSKCLILTLKLELSGSAPKKTSARPG3SLWLPPHSQE 
QTPPASKLQGGGGGLQraWGLHP VPVTAAS Pr»PRWCLFGAVAJC\ 
GI,PGP*LCPSGAA/GGI.QRGPGLSPLGAAGKVSCI,HPPSMVENN 
DSTCHEHHEGILAARVTPVP\SGKPGRVIJCPPGRVCRPPHPAAS 
PRPPGS/ SDLDGPRPQMFILRAFPAAHGGPVNTPHGGEEiCTFMSS 
QIRRKETKPL*RKTPAG\NNYQSNSIPVSQSPQLTVI>LLPSAGR 
TQAPSGRGDAGKPTPGHG \LPKAS VXLTPNCPCSLAGGQ * PPGL 
YPKTPKQRRWRRPL/LLGPSQ*GSRQSTC+EV\GALGBPVRIPG 
Ii*PDLSCILSNGSKHRREGX.SFPRSU3PGRRGPAGLQSI/3CSPT 
PKK17\CHSSGITVALQAGHDSARDVGSGHVALQAGHDSTQDVGRP 
VWRWIPI,E*i:.GLSRETGQATRRGI,VWISPGRAAAACVACAQALE 
EGPriRLPGQDRGAQPCSHCPGRAAGQPEPGAGAPCRE/GG*DPT 
GI*T/GVPGTDPKRGGRKPGQSGQETQGPTVWSGPESPLQPKP*E 
RQE /VGAGASSGVGLS RGRAGGPSSAWBVAAMLLLUIHGSHSEL 
TDLTEAQTSQH 


5716 


1711 


1370 


RVFSLLCEGPGHCYQGAVCRBACAAASPGLDSAAEPHRLCEHT0 
*riPK*GPGYIQHFHCDSNlLCIX*YNISFNIiFSYSF*GVARYAC* 
RCPIiVL*SGFFTIIVGGYSCCMPLKT 


5717 


44 


1489 


liPTEALRESEWVSEYGKCXSPRGLVPEGESTSPLPSSVDTEDSLD 
EGPGALVLESDLtUSQDLEFEEEEEEEEGDGNSDQLMGPERDsrf 


5718






UU^LGARPGi^FYGLSBDESGGGRALSAESEVEEPARGPGaARCE— 
RPGPACQLCGGPTGEGPCCX3AGGPGGGPLLPPRLLYSCRLC7IW 
SHYSSHLKRHMQTHSGEKPFRCGRCPYASAOLVNLTRHTRTHTG 
EKPYRCPHCPFACSSLGNLRRHQRTHAGPPTPPCPTCGFRCCTP 
RPARPPSPTEQEGAVPRRPEOALLLPDLSLIT/PPGGASFLPDCG 
Q\CGVKGRASAGLDQNHCQS/SLFPWTCRGCGQELEKGEGSRLG 
AAMCGRCWRGEAGGGASGGPQGPSDKGFACSLCPFATHYPNHLA 
KHMKTHSGEKPFRCARCPYASAHLDWLKRHQRVHTGBKPYKCPL 
^^^'-^^"ijAiNijiuaiOKlHSGDKPFRCStiCOTSQiOSMNLIRHM 


5719 


120 


284 


vahalslpaksxundvsmthpqlpptqlawdlcrtclplsvnft 
s**stadplhl 


5720 


48 


428 


x-ijiJWbrVhitMl^ijCWtatiKliAVTGSWADRSPLHEAASQGRLLALRTi;'* 

LSQGYNVNAVTLDHVTPLHEACLGDHVACARTLLEAGANVNAIT 

IDGVTPLFNACSQGSPSCAELLLEYGAKAQP\ESCLPSP 


5721


1 


1051 


LQAFRNASfcVPMVLVGTQDAXSAA^NPRVYRkTSRARKIiSTDLK"" 

\RCT\YYE\TCGGTYGLQMMSVSFQDVAQKWAIi\RKKQQ\lAI 

GPCKXSLPNXSPSHNgAVSAASIPARAPINOGHE/SGGGSAFSD 

YXSSSVPSTPSISQRELRIETIAASSTPTPIRKQSKRRSNIFTS 

RKGADP\DREKKAAGCKVDSIGSGRAIPIKQGIJ.liKRS6KSLNK 

EWKKKYVTLCDI7GLl.TYHPSriHDYMQNrHGKEIDX,LRTTVKVPG 

KRLPRATPATAPGTSPRANGJUSVERSNTQLGGGTGAPHSASSAS 

^I^SERPl^SSAWAGPRPEGLHQRSCSVSSABQWSEATTSLPPGM 


5722 


97 


492 


RHSSPGCSLKKTERSSNAAVST/'TTVQQFKRFlENyRRHIGCVA 
VFYAIAGGLFLERAYYYAFAAHHTGITDTTRVGIILSRGTAASI 
SFMFSYXLLTMCRNLITFLRETFLNRYVPFDAAVDFHRT.TAQTfl 




83 


1043 


VALDVLAGSSPGGGMAGALLGPRVHGIRAVLRVARGGt/QAPGAP 

GSLGVSHAAAPPARPQGAAQSPHRGRRKGGGGAGLPPPRSPRPP 

QESVPASTSTARGPRRVSRRLPPQHPGPRGRRRRPGAGVGAPRR ' 

GRARGQAGLLGRQGQGGRGAERERAALQARRGRRPGPEPDQSCG 

GRPRRAAAAPGRAPADPQPPAPRPAPAPDVRPPADAPAPAPAPA 

PPPPPHLGAr,TAaSGEERQSQPRAETI.RX.GRGAPIiP\PRAERGG 

RPKQAEQQQ\E>KRPTPPARGPQSSGDPAMLPQRAGLRTCGLAGT 

KSSTREIPEMI 


5723 
5724 


88 


1043 


VALDVLAQSSFUGGrmGALlX^PRVHGlfeAVIJlVARGGVQAPGAr" 

GSI^VSHAAAPPARPQCSAAQSPHRGRRHGGGGAGLPPPRSPRFP 

QESVPASTSTARGPRRVSRRLPPQHPGPRGRRRRPGAGVGAPRR 

GRARGQAGIiGRQGQGGRGAERERAALQARRGRRPGPEPDQSCG 

GRPRRAAAAPGRAPADPQPPAPRPAPAPDVRPPADAPAPAPAPA 

PPPPPHLaALTAGSGEBRQSQPRAETLRlKSRGAPLPXPRAERGG 

rpkqaeqqqXpkrptppargpqssgdpamlpqraglrtgglagt 
ksstreipemi 


5725 


3 


1841 



Fl-NEAPPAPLPDASASPLSPHRRAKSLDRRSrEPSVTPDU^FK 

kgwltkqyedgqwkkhwfaiadqslryyrdsvaeeaadldgeid 
lsacydvteypvqrnygfqihtkegeftlsamtsqirrnmiqti 

MKHVHPTrAPDVTSSLPEEKNKSSCSFETCPRPTBKQEAEXGEP 
DPEQKRSRARE\RRREGRSKTFDWAEFRPIC2QALAQERVaaVGP 
ADra\DPWRPEAEHGELERERARRREERRKRraMLDATOGPGTE 
DAALRMEVDRSPGI.PMSDr.KTHKrVHVEIEQRWHQVETTPLREEK 
QVPIAPVHLSSEDGGDRLSTHELTSIiLEKELEQSQKEASDUiEQ 
NRLU3DQLRVALGREQSAREGYVL0ATCERGPAAMEETHQKKIE 
DLQRQHQRELEKLREEKDRLLAEETAATISAIEAMKNAHREEME 
RELEKSQRSQISSVWSDVEALRRQYLEELQSVQRELEVI^EQYS 
SKCIJSaSTAHIAQALEAERQALRQCQRENQELNAHNQELWNRIiA?^ 
rTRtiRTLLTGDGGGEATGSPIAQGKDAYELEVPSGARPCLTQLC 
rQEPQGSAAWPLSYRWGGTDLRQQESQGPGRSKSPEGGEEQ 




3 


1049 ^^ 
t 

1 

I 


/ttGHSEETSQSPNRTEPHDSDCSVDLGISKST3DLSPQKSGPVG 
5WKSHSITNMEIGGLKXYDILSDN\DLSSHLQPLK/FTSAVDG 
CWrVRSKAATLLYDQPLQVFTGSSSSSDLISGTKAIFKPDSNHN* 
^E/GAKYNKRPHKWAHNIiHLKYMVIJIS 1 ISNTVAV\RSQRHFV2? 



364 






WO 01/53312



PCT/US00/34263



SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








LQTKSPNRPCQFSSSAPS/VDQRAQ/INQSYAKHSANMNFSNHN" ' 

nvrantayhlhqrlgparhgemwaispndrlipavtrstiqrSs 

SVSSTASVNLGDPGSTRRAQIPEGDYLSYREFHSAGRTPPMMPG 

SORPIiSARTYSIDGPNASRPQSARPSINElPERTMSVSDFNYSR 
TSP 


5726 


2 


486 


1 SRSI^>IWWNSGbPASSHSSKliPVTVGFSGCVKRLRI>HGRPLQAP • 
\ TRMAGVTPCILGPLEAGLFFPGSGGVITL/ESVGAGIPGPSRAG 
I QGSPGGSGEGPPLSSPSQPLPADLPGATltPDVGLELEVRPLiAVT 
GLI FHLGQARTPPYLQLQVTEKQVLLRADDG 


5727 


21 


221 


RP ILI LKETRRIiPWATGYAEVINAGKSTBNEDQASCEVLTVKKK 
AGAVTSTPNRWSSKRRSSliPNGE 


5728 


2 


877 


GTRNGQFEPRRGRAWEGSAGGLRAPGAAAGGPGVQPRGSG /l,PG " 

NAIRAGVWPGRGPASPFWDLSLPWDLWPPPTDHAPGAPDFPAVE 

GR\PWAGGRPPWPVSGVLGSRVCGPLYSTSPAGPG/SGGLSPSQ 

GGPAGAGGDAG/LPGRCPSAPWRAGSRPJWSCPDWIPGPQGLWL 

HRN PTS/GPPSQ IGBGAEQGDEX3 VADAPQ IQCKN/GAEDPPAED 

EPPQVPEAGEEDAVPABEGPGGTPETQADQVRERPEAHIiAEGGA 

KGSPRKIiADPQDLPAGQMSIiAPPFPPVAAVIRSMK 


5729 


1 


1525 


AGGARSVLTLULGHPAGFVGAHWWWQQDAAlARATDSKfiPPG'm^ 

CPDVLyRTGRTLHGQETYTPRLIU^IDLKSSLSSLKEEGGLyRDK 

QLDAAIAWQGKLTTHKEELYPKNPYLQDFLSAEGVLSSDGVMRV 

KS I PNGKGSS PLPTATTPKPLIPTEAS r RVWSDFLRVHmPRSI 

CMIQKYNHDGEAGRLEAFGQGESVLKEPKYQBELEDRIJiFTVEE 

CXlYLQGPQILCDLHDGFSGVGAKAAEIiriQDEySGRGIITWGLIiP 

GPyHRGEAQRKIYT^rjJNTAFGLVHLTAHSSLVCPliSlXSGSLGLR 

PEPP VS FPYLHYDATLPFHCS AILATAU7TVTCS\ YRLCSS PVS 

MVHLXADMLSFCGKKWTAGAIIPPPLAPGQSLPDSLMQPGGAT 

PWTPLSACXSEPSGTRCFAQSWLRGIDRACHTSQIiTPGTPPPSA , 

LHACTTGEEIIAQYLQQQQPGVMSSSHliLLTPCRVAPPYPHLFS 

SCSPPGMVLDqSPKGAAVESVPVFG 


5730 


1258


1713 


KKFQAPARETCVECQKTVYPMERLLANQQVFHlSCFRCSYCHNk ■' ■ 
liSLGTYASIiHGRIYCKPHFNQIiFKSKGNYDEGFGHRPHKDLWAT 
KIETEGFWERPRNFENCGRPLKSPGGEDCPSC*GGCPGSNY*AQ 
GSSSREKGGQASWNPKLRVA 


5731 


122 


443 


RSHRGEbIPKDSCYMRKPPRRPKKRRQG/CM.PQGCLTFKDVAr 
EFSLEEWKdiNPAQRAI/YRAVMLENYRNLESVGI^TSKDSWYMRK 
KPGRGRGKQRRQBWFFIJRVY 


5732 


226 


772 


PPSRSCQSPRRKSRRRAHVTVTIiVCGFTSFSFSLPliYIiCXSCliRF 
PERTCSQIiQQAPWAPDFGPSSFVPSWGATAlGARKFLIAFNiXN 
U/GTKEOAHRIALNIJiEQGRGKDQPGRLXKVQGIGWYLDEKNLA 
QVSTNLl^DFEVTALHTVYEETCREAQELSLPWGSQIiVGLVPLK 
ALLDAA 


5733


1 




PALQEVWA>IAI^GKQYENDARTLFBFTSGVNDTEiSPIIYRDES 
MRTACSPPGIiCSDGNGDEIiKGPFTSRDFMKFRLGGFEAlKSAYM 
AQVQYSMWVTRKNAWYPT^NYDPRMKREGIiHYVVIERDEKYMVAS 
FDEI\VP\EFIGKMDEVLSRDPM 


5734 


3 




KLJx^>i:'j:*oijioijuvjuiti i, ANJNljr vljIPAjrSKNRAyAIF?XVFTVr 
GSLFLMNLDTAI I YSQFRG YLMKSLQTSLFRRRLGTRAAFEVI^ 
SMVGEGGAFPQAVGVKPQNIiLQVLQKVQLDSSHKQAMHEKVRSY 
GSVLLSAEEPQKIiFNELDRSWKEHPPRPEYQSPFIiQSAQFLFG 
HYYFD YliGNLIAIiANLVS ICVFLVLDADVIiPAERDDFILGlLNC 
VFIVYYLLEMLLKVFAIiGLRGYLSYPSNVFDGLLTVVIiLVLEIS 
TIiXVCTDCHTQAGGRRWW/RLLSLWDMTRMLNMLIVFRFI^IIP 
SMKPMAVVASTVLGL 


5735 


2 


540 


FFTPCVARAb'NFPDQATVKKAAYSLPRVGGGTSOGIiPQARRISL 
ATPRQLYK/SStJMTQRWQRREISNFEYLMFLNTIAGRTYNDIiNQ 
YPVFPWVLTNYESEELDLTLPaNFRDLSKPlGALNPKRAVFYAE 
RYETWEDDQSPPYHYWTHYSTATSTLSWIiVRlVS I F lELACLWY 
LKILT ^ 


5736 


1 


382 


GTRPSTKKSCiySPQQVAVIHCKGHQKENTAVAHSNQKAbSAAQ/ ■ 
SEQ
ID
NO:


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence



5737 



290 



5738 



5739



5740 



265 



Predicted end
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 



1041 



1222 



231 



5743



5744 



5745



650 



362 



703 



599 



5746 



5747 



821 



1328 



Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)




QES**ILPDSGIFIP^T^TSYLQSTTH LRRAKLPOLLRr^ 
KAUbHLLSSFLTSNFLFNPLLPD SLYSVfiARij^KAMLOPCRRKR 



.-™**^*,o«i-4jAoiNrijriMfi^x.fUbtjybVEARSQRANLGPCRRK 
LQTU4RLAAGFQYSSHKDPSLSAKEKKTDYHNEARGPWPGWVG* 
RTADGSCGRGPDGAHHPGPKSSSWRASRLLPGLGGSHHLDAYVG 
RDLECGTPAPLQLEIPPQPRGHPAPIPTGQAGPRDSGPGASP*V 
ETRPLTDGRR*PGVRPVGWTPAHPAGTLRPRGAVEPSVSACGKW 
APSPTSQGCCEGRCDAVPKHRAWRTPLCSQ 

iJTI.St*NCTLPi£TLPMTPSP*LSFL*PPGlARAKSIPTKTySNBV 



— ^w*,*^^* ij"rfUJUAKrtfti>XfTKTlSNBV 

VTLWYRPPDILLaSTDYSTQIDMW*GOVEVI70GPCGKGGGLVTT 
ATQPAAFLPTVPSbPRGVGCIFYEMATGRPIiPPGSTVEEQriHFl 
FRILSEEAWALCAVET HR 

SFQRRGjRWNVHTIJ{PHPRAVWAGIGRGH GiJ*AUiGRARAPAi;c ' 

FPTLLEPLESLEPDLPALRAMGLHIiWAAGPGTHPAGXSDIilAEV 

SAEVDGPVPGYLSSPQSITDTCLYIPTSGTTGLPKAARISHLKI 

LQCQGFYQLCGVHQEDVIYLALPLYHMSGSU/SIVGCMaiGATV 

VLKSKFSAGQFWEDCQQHRVTVFQYIGELCRYLVNQPPSKAERG 

HKVRLAVGSGLRPDTWERFVRRFGPLOVLETYGLTEGKVATINY 

TGQRGAVGRASWbYKHIFPPSLIRYDVTTGEPIRDPQGHCMATS 

PGEPGIoLVAPVSQQSPFLGYAGGPELAQGKLLKDVPRPGDVFFN 

TRDLIiVCDDOGFLRFHDRTGDPFRWKGENVATTEVABVFEALDF 
LQEVNTVYGVTV --r^w^^t 



PAYWr.KVPTl.CI.ESKTDLRSKASHVSAQLgGKVRGUlGAbWM*A 
YVyERVYN*NISRMVHAI.EQKRHPAGLSSSMALQI^PCIy3MLMA 
j^SEIiHKLYDEETQSWVSQSACG GYP 

^PRKTMRRGVl.MTLi:iQQSAMTI.PLWlGK PQDRPPPI,QGAl PASGD " 
YVARPGDKVAARVKAVDGDEQWIIAEWSYSHATNKYEVDDIDE 
EGKERHTLSRRRVIPbPQWKANPETDPEALFQKEQLVIALypQT 
TCFYRALIHAPPQRPQDDySVLPEDTSYADGYSPPLNVAQRYW 
ACKEPKKK*CRLADSPSPm)TGQDSRG RAGIKHIPPIiKKK 
xgsVKEILKRNPWVNLTDKD GMTALMIASKEGHTKIVQDLLDAG 
TYVNIPDRSGDTVLIGAVRGGHVEIVRALLQKYADIDIRGQDNK 
TAIiYWAVBKGNATMVRDlLQCNPDTEICTKP G 

GKTPEGIDAlEEIEIDLEKTEREISPQ feNGLEEVKPLGEMQTDt " 

KATGREISPREKTPEVIDATESIDKDLEETGRREISPEENGPEE 

VKPVDEMETDLKTTGREGSSREKTREVrDAABVlETDLEETERE 
ISPQE 

TRRTTTTSPTTTRQMTTTPAALPTTV VTTPDLTTGTPLQMrTIA 

VFTTANTCLSr,TPSTLPEKATGZ,i;/TPEPSK:EGPILTAESETVLP 

SDSWSSAESTSADTVLLTSKESKVWDLPSTSHVSMWKTSDSVSS 

PQPGASDTAVPEQNKTTKTGQMDGIPMSMKNEMPISQLLMUAP 

SrXSFVLFALFVAFLLRGKLMETYCSQKHTRI^yiGDSKNVIJTDV 
QHGREPEPGLFTL 

GKSRFVNLMKHSKKTYDSFQDELBDyiKVyKARGLEPKTCPRKM" 

KGDYLETCGYKGEVNSRPTYRMFOQRIiPSETIQTYPRSCNIPQT 

VENRLPQWLPAHDSRIjRLDSLSYCQFTRDCFSEKPVPLNFNQQE 

VrC33SHGVEHRVYKHF5SDNSTSTHQASHKQIHQKRKRHPBEGR 

EKSEEERSKHKRKKSCEEIDLDKHKSIQRKKTEVEIETVHVSTB 

KLKNRKEKKSRDWSKKEERKRTKXKKEQGQERTEEEMLWDQS I 
LGF 



SrASGRLTPSSPAFDGELDIKJRYSNGPAVSAWSLGMGAVSWSES" 

RAGERRFPCPVCGKRFRFNSILALHLRTHQPERPRSPAARLLLE 

LEERALLREARLGRARSSGGMQATPATEGLARPQAPSSSAPRCP 

yCKGKFRTSAERERHLHIt,HRPWKCGLCSFGSSQEEELLHHSLT 

AHGAPERPLAATSAAPPPQPQPQPPPQPEPRSVPQPEPEPQPER 

EATPTPAPAAPEEPPAPPEFRCQVCX3QSFTQSMPLKGHMRKHKA 
SPDHACPV 



DRHVETLClHFLGPSTGSTAKTGGRNWLKTGNCLYGNTCRFVHG 
PSPRGKGYSSNYRRSPERPTGDLRERtKKKRQDVDTEPQKRNCT 
ESSSPVRKES5RGRHREKEPIKITKERTPESEEENVEMETNRDD 
SEQ
ID
NO:


Predicted
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








SDNGDINTfI?YVHELSLEMKRQKIQRELMKLEQENMEKSlElTlir* 
KBVSPEWRSKLSPSPSLRKSSKSPKRKSSPKSSSASKKDRWTS 
AVSSPLLDQQRMSKTNQSKKKGPRTPSPPPPIPEDIALGKKYKE 
JM A. V ivxJK i KUGKDRGRDFERQREKRDKPRS TS PAGQHHS P 

ISSRHHSSSSQSGSSIQRHSPSPRRKRTPSPSYQRTLTPPLRRS 

ASPYPSHSLSSPQRKQSPPRHRSPMREKGRHDHBRTSQSHDRRH 

ERREDTRGKRDREKDSREEREYEQDQSSSRDHRDDREPRDGRDR 
RE 


5749


934 


473 


SiKGPOVFYKCilAPTLIAIFPYAGLQPSCYSSLKHtVKWAIPAEG 
KKNENLQNLLCGSGAGVISIOTTYPLDLFKKRLQVGGPEHARAA 
FGQVRRYKGLMDCAKQVLQKEGALGFFKGI^SPSLLKAAIiSTGFM 
FFSYEPFCNVPHCMNRTASQR 


5750


552 


a 


GFPVDPRVRGSTLSLAERPKGMIRSGSFRDPTDDVHGSVLSLAS 

SASSTYSSAEERMQSEQIRKLRREliESSQEKVATLTSQLSANAN 

LVAAFEQSLVNMTSRLRHIASTAEEKDTEI.LIX[.kETrDFLKKKN 

SEAQAVIQGALNASETTPKELRIKRQNSSDSISSLNSITSHSSl 
GSSKDADA 


5751 


22 


866 


IFISICLWKAHLCFrJ^PKDCIDQVMKJXJ^TLPVDDSGRYIAIQF 
£ILEWAYVPLYYYEYRKAKDQLDIAKDISQLQIDLTGAI/3KRTRF 
QENYVAQLIIiDVRREGDVL^NCEFTPAPTPQBHLTKNLELNDDT 
ILNDIKIiADCEQFQ^1PDLCAEEIAIILGICrmFQKNNPVHTLTE 
VELIAFTSCUCiSQPKFKAIQTSALlLRTKLEKGSTRRVERAMRQ 
TQALADQPEDKTTSVLERLKIFVCCQVPPHWAIQRQLASIXPEL 
GCTSSALQIFEKI*EMWE 


5752


3 


751 


SCGSALRAWRCGAAAIATFPAPALPGLMyRALYAFRSAEPNALA 
PAAGETFLVLERSSAHWWLAARARSGETGYVPPAYLRIOiQGLEQ 
DVLQAIDRAIEAVHNTAMRDGGKySLEQRGVLQKbrHHRKETZ,S 
KKtaf5ASSVAVMTSSTSDHHI»DAAAARQPNGVCRAGF*ERQHSIiP 
SSEHLGADGGIiFQIPLPSSQIPPQPRRAAPTTPPPPVKRRDREA ' n 
liMASGSGGHNTMPSGGNS VS SGSS VSSCI 




3 


471 


UPVCGVGLSVAWAG ? : RGPVHS V^GGGRAALHGAELPCi^SGAAT 
VEREMBDRHKNEMLRVETEARARAKAERENfADI IREQIRLKASE 
HRQTVLESIRTAGTriFGEGPRAFVTDRDKVTATVNrFIKQGWQV 
AERQHVGASWS PRSCPCRLCTAIi 


5753


34 


483 


i-\aruf V v/i\r r V KIN Ji y Ti'KTGHR XRKIJX?! 
GQEAFKKLNYLDIGEIKKRPMEWNTEVKPVIHSRINVSARFRK 
PLQEPCri FLIANGDLXNPASRrXIPRICTLNQWDltVl.QMVTBKr 
TIiRSGAVHRLVTLEGRr^V 


5754


14 


331 


■ri.VHVV£:FAGEHAEAIASREQEVI^WKEIjI>SACEDARI,HVSST~ 
WPTPATPSPLTAPFSME 


5755
5756 


3 


838 


LGDQF YKEAIEHCRS YNSRLCAERSVRLPFLDSQTG VAQNNCY f~ 

WMEFCRHRGPGrAPGQLYTYPARCWRKiCRRLHPPEDPKLRLI.EIK 
PEVELPLKKDGFTSESTTIjEAItI*HGPnvPV'ifunaDT?i?i?evrt«>Trt 

RVLEWDENVEEGNEEEDLEEDIPKRKNRTRGRARGSAGGRRRHD 
AASQEDHDKPYVCDICGKRYKNRPGIiSYHYAHTHLASEEGDEAQ 
DQBTRSPPNHRNENHRPQKGPDGWIPJMNYCDPCI/SGSNMNKKS 
GRPEEIiVSCADCGRSAHLGGEGRKEKEAAA 


5757 


3 


621 


SSKUJALFAHPLYNVPEEPPLUSAEDSLLASQEAliRYYRRKVAR 
WNRRHKMYREQMNLTSLDPPLQLRXJEASWVQPHLGINRHGLYSR 
SSPWSKLLQDMRHFPTISADYSQDEKALLGACDCTQIVKPSGV 
HLKLVLRPSDPGKAMFKPMRQQRDEETPVBFFYFIDFQRHNAEr 
RAFHLDRILDPRRVPPTVGRIVNVTKEIL 


5758


3 


473 

] 

J 


yKDAIiliLPDNHRQWFENGTIiKLTDVQKGMDEGEYI^CSVLIQ^Q 
C»S ISQS VHVAVKVPPLIQPFEFPPASIGQr.I.YI PCWSSGDMPI 
^ITWRiOXSQVI ISGSGVTI ESKEFMSSLQIS S VSLKHNGNYTCI 
iSNAAATVSRERQLIVRVPPRFW 




1 


474 


?'RRGAGAERGEHREGERGAAGMGEFKVHRVRFFNYVPSGIRCVA 
^NNQSNRIAVSRTDGTVEIYNLSANYFQEKFFPGHESRATSALC 
^AEGQRLFSAGLKGEIMEYDLQALNIKYAMDAFGGPIWSMAA^ | 
SEQ
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of
amino acid
sequence


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


5759


2 


1240 


SGSQLLVGCEDGSVKLFQITPDKIPV ~ ™~ 

GNAAFAGQGVVYETFHMSDLPSYTTWGTVHVVVNNQIGFTTDf»R 
MARSSPyPTDVARWNAPIFHVUADDPEAVrYVCSVAAEWRNTP 
NKDVGADLVCYRRRGHNTEMDEPMFTQPLMYKQIHRQVPVIiKKYA 
DKLXAEGTVTLQEFEEEIAKYDRICBEAYGRSKDKKimiKHWL 
DSPWPGFFNVI}aEPKS&!TCPATGIPEDMLTHIGSVASSV?LEDF 
ivi n L\sus> K. X Jj«\JK/UJrTrKNKT VDWALTVE YMAPGSLLKEG I HVRIi 
NGQDVERGTFSHRKHVLHDQEVDRRTCVPMNHLWPDQAPYTVCN 
SSLSEYGVLGFELGYAt4ASPNALVLWEAQFGDPHNTAQCIIDQF 
ISTGQAKWVRHWGXVLLLPHGMEGMGPEHSSARPERFLQMSNDD 
SDAYPAFTKDF2VSQIi 


5760


1 


1221 


VRDITSDSLSLSMTVPEGQFDKFLVQFKNGDGQPKAVRVPGHED 
GVTlSGLEPDHKYKMNLYGFHGGQRVGPVSAVGLTAPGiOJEEMA 
PASTEPPTPEP?IKPRIiEELTVTDATPDSI»SI^WTVPEGQFDHF 
LVQYKNGDGQPKATRVPGHEDRVTISGLEPDNKYKMNLYGPHGG 
CRVGPVSAIGVTAAEEETPTPTEPSMEAPEPPEEPLLGELTVTC 
SSPDSliSLSWTVPQGRFDSFTVQYKDRDGRPQWRVGGEESEVT 
VGGLEPGRKYKT^HIiYGLHEGRRVGPVSTVGVTAPQEDVDETPSP 
TEPGTEAPBPPEEPI4LOELTVTGSSPD3LSLSWTVPQGRFDSFT 
vy I Al^KUt»KPQAVRVGGQESKVTVRGIjEPGRKyXMHIjYGI»HBGIl 
RLGPVSAIGVT 


5761


3 


1275 


SCDMAEAAALVWIR3PGFGCKAVRCASGRCTVRDfe'tHRHCQDQM 
VPVENFFVKCNGALINTSDTVQHGAVYSLEPRLCGGKGGFGSML 
tuujUAu i *s ft. 1 im Ki£ACRDl*SGRRLRD vNHEKAMAEVfVKCX)AERE 
AEKEQKRLERI^RKLVEPKHCFTSPDYQQQCHEMABRLEDSVLK 
GMQAASSKMVSAE ISENRKRQWPTKSQTDRGASAGKRRCPWliGM 

EGLETAEGSNSESSDDDSEEAP5TSGMQFHAPKIGSNGVBMAAK 
FPSGSORARWNTnHf5<5PPnTy^TD^/Tne/-lSI3TT «>r\o/^»«>T 

EHMESRt4VTETEBTQBKKAES KEPI EEE PTGAGLNKDKETEERT 
PGERVAEVAPEERENVAVAKLQESQPGNAVIDKETIDLLAFTSV 
AEIiELLGIiEKliKCEIiMALGIiKCGGTLQ 


5762 


2 


344 


GSTGQTPLHiJyGGGGGSGGGRRRTPRGMPKEKYEPPDPRRMYTI 
KSSEEAANGKKSHWAEJbEISGKVRSLSASLMSLTHLTALHLSDN 
SLSRI PSDI AKLHNLVYLDLSSNKIR 


5763 


3 


429 


LDKPTGLIMLIART.nyRT.TAtari'T'T.T'TTftp 

LDVNDNVPTPQKDAYVGALRENEPSVTQLVRLRATDEDSPPNNQ 
ITYSIVSASAFGSYFDISLYEGYGVISVSRPIiDYEQISNGLIYL 
TVMAMDAGNT 


5764
5765


19 


441 


VCARACGEMHQLLRPIDRQRYDENEDLSDVEEIVSVRGFSLEEK' 
LRSQLYQGDFVHAMEGKDFNYEYVQREAItRVPLIPHElCDGLGIK 
MPDPDFTVRDVKLLVGSRRLVDVMDVNTQKGTSMSMSQFVRYYE 
TPBAQRDKL 


" 576^' 


3 


825 


QKILRLNNSHQPPTSSSN^SKDCGGPASSGAGAtAALAD^LKPAq — 
VQASAPQGNSHKETSKSKVKRS KTSKDANKSL?SAALyGI PEIS 
STGKRQEVQGRPGEATGMNSALGQSVSSGGSGNPNSNSTSTSTS 
AATAGAGSCGKSKEEKPGKSQSSRGAKRDKDAGKSRKDKHDLLO 
GHQMGSGSQAPSGGHLYGFGAKSNGaGASPFHOGGTGSGSVaAA 

GBVSKSAPDSGLMGNSMLVKKEEEEEESHRRIKKIiKTEKVDPLF 
TVPAPPPHV 




1608 


663 


SGLFSVDPASSQAMBLSDVTLIEGVGNEVMWAGWVLIIALVL 
AWLSTYVADSGSNQI.LGAIVSAGDTSVLHLGHVDHLVAGQGNPE 
PTELPHPSEGNDEKAEEAGEGRGDSTGBAGAGGGVEPSLEHLUD 
IQGLPICRQAGAGSSSPEAPLRSEDSTCLPPSPGI^ITVRLKFLND 
TBELAVARPEDTVGALKSKYFPGQESQMKLIYQGRLLQDPARTL 
RSLNITDNCVIHCHRSPPGSAVPGPSASLAPSATEPPSLGVNVG 
SLMVPVFWriI^VVWyFRINYRQFFTAPATVSIiVGVTVFFSFI.V 
FGMYGR 


5767


2 


892 


NFRATPRPPTRPELRTGTEVILWYLDWRALMKRKRMKANIKLVG 
SGFPLPSSDLDDSI.TEEIDE^aGFRNDA^TFDWQNVADFRDAGGS 
LTEVKVEEEBRDPQSPEFEIEEEEEMLSSVIPDSRRENEIiPDFEif 
SEQ
ID
NO:


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 



Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of
amino acid 
sequence 



5768 



Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)

LQVEKERLOlEKERLRHLDMEHERLQLEKERLQIEREKLRUJiV 
NSEKPSLBNEI/3QGBKSMLQPQDIETEKLKI,ERERLQLE^0 
.q^lP.QPr .Q^fQTro nW^ts^ ,-. ^ ■■■ 



5769 



5770 



484 



5771



168 



741 



^^K^'Ki^VSVSPPPI> 01Vb.i.0ib>PI:-AWEFCSRLGSAVTSQRAGPA - 
AAhT/AKDYPFYLTVKRANCSLELPPASGPAKDAEEPSNKRVKPL 

WSRNAAPSSrKRRD5KX.WSETFDVC Ai^ur-Kt- 



-.,.-KKGVKEKATDgsVKAFAEHCPEr^yVOFMGCSVTSKGVIHrr 
TKLRNLSSLDlJmiTELDNETAMEIVKRCKNLlSLNLCWWIIN 
DRCVEVIAKEGQNLKELYLVSCKITDYALIAIGRYSMTIETVDV 
GWCKErTDQGATI,IAQSSKSLRYLGIMRCDKVNEVTVEQLVCX)Y 
PHITFSTVLQDCKRTLERAYQMGWTPNMSAASS 

i3SRRYDVKTRKWSFLI>ESHSKI. JAKVRCI.P0VQLUPI.PTTLTIA" 



* « V j^^nz>r i4XjjiiswiiiU,JlAKVRCLPQVQLUPjLPTTLTIiA 

FASQLKKTSLSLTPDVPEADLSEVDPKLVSNLMPFQRAGVNFTM 
J^GGRLtJ^DMGE^KTIQAICIAAFYRKEWPLLVVVPSSVRFT 
WBQAFLRWLPSLSPDCINVWTGK DRLTA 

m>LPSACLRAKSWREAS£GPSSR A(^NUSUDTFKACYSGTSTI^5 



148 



363 



5773



5774 



5775



538 



5776 



945 



5778



1210



FHGSHCSGSDHSSLGI.EQLQDYMVTI,RSKLGPI>EIQQPAMLLRE 
YRW5LPIQDYCTGLLKLYGDRRKFLLLGMRPFIPDQDIGYPEGP 

XLTFSNI1VTCSAIYHI.PVPPEREP GCSMRDLRVA 
PRVRSyjiMFCFMKMNTRI.Q VEHPVTEMIlX?'AmVEWQLRIAAGE 
KIPLSQEEITLC2GHAFEARIYAEDPSNNFMPVAGPI,VHLSTPRA 
DPSTRIETGVRQGDEVSVHYDPMIAKLWWAADRQAALTKLRYS 

SRKAAAKESLCQAALGLILKEKAMTDTPTLQAHDQFSPFSSSSG 
RRLWI5YTRNMTLKDGKNSK W^mwf F5SSSG 

t^VEBENIRWRCGGSELNFR RAVFSADSKYlFCVSGDFVKVTST 
VTEECVHIIiHGHRNLVTGIQLNPWNHLQLysCSLDGTIiaWDYl 
DGILIKTFIVGCKLHALFTLAQAEDSVFVIVNKEKPDIFQI^VSV 
KLPKSSSQRVBAKELSFVLDYINQSPKCIAFGNEGVWAAVREP 

gaocUDPAAPSSLAEAATMPVS KCPKKSBSLWKGWDRKAQRNGi; 



RSQVYAVNGDYYVGEWKDNVKHGKGTOVWKKKGAIYEGDWKFGK 
RDGYGTLSX.PDQQTGKCRRVYSGWWKGDKKSGYGIQFEt3PKEyY 
EGDWCGSQRSGWGRMYYSNGDIYEGQWENDKPNGEGMLRLSQNP 



KJ^DCVCQNl^ESI^TLCPiiKGi^FVPPDlDRRTVELREGa^ 

IIIIISRQDPANMTGLVDLTI^RNTISHIQPFSFU)LESLRSIJIL 

DSimLPSI^EDTLRGLVOTiQHLIVKKNQLGGIADEAFEDPLLTL 
EDLPLS YNNLHGPAVGLRGDAWVQPS T S 

GUDPEPGQDIiFQPEREVDP SWGROREPRLGKLRFQNDH LSWjCQ" 
VXKLEQALKDGSAGLDPQLPGTCYSPHCPPDKAEAGSTLPENLG 
^^^!^y^^^^^°''^^^^'^^^^^RJ^<5SCRRPWDRSLENV 

mr^eofo?!''''''''^^'*'*^^^'^^^^QKSSADHRKSYEFE 
vP^:?^f!^!2'''^'^''Q^''^^^^^^2ENVyEDI 



gKRQSVSRLLLPVFLLEPPAEPGL fc.t>PPfc.EEGGb!PAGVAEEPGS 
GGPCWLQLEEVPGPGPLGGGGPLRSPSSYSSDELSPGEPLTSPP 
WAPi:>3APERPEHLi:,N^Vi:,ERLAGGATRDSAASDXr,I,DDlVLTHS 
LFLPTEKFU2ELHQYFVRAGGMEGPEGLGRKQAC3UAMLt^FLDT 
YQGLtiQBEEGAGHllKDI,YLLIMKDESLYQGI,REDTI.RZ,HQLVE 
TVELKIPEENQPPSKQVKPLFRHFRRXDSCLQrRVAFRGSDEIF 
SEQ
ID
NO:


Predicted
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


5779 






^i.F^::t;iQVSPGDTklHkVEPEDVANHLTAFHHJ;;LFRCVH3LEyV- 
DYVFHGE ''^ 


5780 


138 


1571 


EAVQVLIKHSAPVNARDKNWUTPLHVAAANKAVKCAKVIIPLLS 
EVNVSDRGGRTAbHHAAI^GHVEMVNLLIAKGANINAFDKKDRR 
ALHWAAYMGHLDWALLlNHGAEVTCKDiCKGYTPLHAAASNGQI 
NWKHLLNLGVElDEINVYGWTALHIACYNGQDAWNELIDyGA 
NVNQPNNNGFTPLHFAAASTHGALCLELI.VNNGADVNIQSKDOK 
SPLHMTAVHGRFTRSQTLIQNGGEIDCVDKtXSNTPLHVAARYGH 
EIXINTLITSGADTAKCGI^ISMFPLHIAALWAHSDCCRKLIiSSG 
QKYSIVSLFSNEHVLSAGPE1DTPDKFGRTC1.HAAAAGGNVECI 
KLLQSSGADFHKKDKCGRTPLHYAAANCHFHCXBTLVTTGANVN 
ETDDWGRTALHYAAASDMDRNKTILGNAHDNSEELEHARELKEK 
EATLCLSFLLQWDANPS IRDKEGYWS IHYAAAYGHRQCLEUCLE 
RTNSGF3ESDSGATKSPLHLAVSEMP 


5781 


154 


624 


Wi^ r Kv XTUl.FFKGPDYRLYKSEPBLTTVAEVDESNGEEKSEPV;^ 
EIETSWKGSHFPVGWPPRAKSPTPESSTIASYVTLRId'KKMM 
DLRTERPRSAVEOLCIAESTRPRMTVEEQMERIRRHCQACIiREK 
KK<3LirVIGASDQSPLQSPSNLRDWP 


5782 


19


^41 


RGSLGGHPWi<^FMRAASQUCbPWSKVTGPHQEkAYti6RGPGGAF" 

PAPPVSGTCPPDLXYAPTPEKAEGGSQKNHQPPPGERAAHRDGB 

QAPCRAGPTRKVAVAPRPPSCP*GPE\PGEEPRRPLDRSPPLGQ 

VQPHFTSQDAKSAEDEAPSRHI/3KHQPRSAQVGSRLDALQGPKT 

QHSIHTVTCKSPRQKEDRSPKPPQAPKHPEEHGRQS\QAPPPr,P 

VAPSRTCGGC*TWDPALLVSP/PQGDSTPEI.PAP\QQPTGGPSR 

CRQALPPQG*RQQPRQRPR/pTQASRSHPAKAKGCQGPPKIRNY 
NIMD 


5783 


5176 


1237 

] 
I 
I 


DRSWMSPlAADSYTDSyrDTYTEAYWVpbLPP'liikPPl-MPPLPPEE 

PPMTPPLPPEEPPEGPALPTEQSALTAENTWPTEVPSLPSEESV 

SQPEPPVSQSEISEPSAVPTDYSVSASDPSVLVSEAAVTVPEPP " 

PEPESS ITLTPVESAWAEBHEWPBRPVTOWSETPAMSAEPT 

VLASEPPVMSETAETFDSMRASGHVASEVSTSLLVPAVTTPVLA 

ESILEPPAMAAPESSAMAVLESSAVTVliESSXVTVLESSTVTVL 

EPS WTVPEPPWAEPDYVTI PVPWSALEPSVPVLSPAVSVLQ 

PSMIVSEPSVSVQESTVTVSEPAVTVSEQTQVIPTEVAIESTPM 

IIiESSIMSSHVMKGINLSSGDQNLAPEIGMQEIALHSGEEPHAE 

BHLKGDFYESEHGINIDLNINNHIilAKEMEHNTVCAAGTSPVGB 

IGEEKI LPTSETKQRTVLDTYPGVSEADAGETLS STGPFALEPD 

ATG\TSKGI3FTTASTLSLVNlCYDVDLSt,TTQDTEHDMLISTSP 

SGGSEADIEGPLPAKDIHLDLPSNINLVSSDTWBPLPVKED\DQ 

TLAALI\SIi:<ESSGaEKEVPPPS*REHr,PDSGFSANIEDINEAD 

LVRPVSSPRTWNVLPSPRAGt,\EGP\LLASDFGPVC3NLYSSPW 

\SSMP\ERASGS\SSGEKGG\YEIFVKVKDTHEKSKKNKNRDKG 

EKEKKRDSSLRSRSKRSKSSEHKSRKLTSESRSRARKRSSKSKS 

HRS\Q1'RSRSRS/RDRRRRSSRSRSKSRGRRSVSKEKRKRSPKH 

RSKSRERKRKRSSSRDNRKTVRARSRTPSRRSRSHTPSRRRRSR 

SVGRRRSPSISPSRRSRTPSRRSRTPSRRSRTPSRRSRTPSRRS 

RTPSRRSRTPSRRRRSRSWRRRSFSISPVRLRRSRTPUUiRFS 

RSPIRRKRSRSSERGRSPKRLTDLDKAQLLEIAKANAAAMCAKA 

GVPLPPNLKPAPPPTlESKVAKKSGGATIEEIiTEKCKQIAQSKE 

DDDVIVNKPHVSDEEEEEPPFYHHPFKI.SEPKPIFPNLKIAAAK 

PTPPKSQVTLTKEFPVSSGSQHRKKEADSVYGEWVPVEKNGEEN 

KDDDNVFSSNI.PSEPVDISTAMSERAIAQKRLSENAPDLEAMSM 

LtnWVQERIDAWAQIA^SIPGQPTGSTGVOVLTQEQIiAirrGAQAWl 

KKDQFLRAAPVTGGMGAVLMRKMGWREGEGLGKNKEGNKEPILV 

DFKTDRKGLVAVGERAQKRSGNFSAAMKDIjSGKHPVSALMEICW 

ecrrwqppefllvhds6pdhrkhflfrvlingsayqpncmfflnr 




1693 



faSS 1 

< 


^SGLRVAFT^mGISNFKTPSKLSEICKKSVI.CSTPTINiPASPFH 
2KLGFGTGVNVYl*MKRSPRGLSHSPWAVKKlMPICNDHYRSVya 

crlmdeakilkslhhpnivgyrafteanix;slciameyggeks7 

IDLIEE/PX*SQ/PKILFQQP/X,ILKVAI,KMARGr.KYXiHQEKKL 



SEQ 
ID 
NO: 


Predicted
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


5784 






i.no^DiA^^NWIKGDFETIKICDVGVSLTLlJKNMTVrDPEAOyiH 
GTEPWKPKEAVEENGVITDKADIFAFGLTLWEMMTLSIPHINLS 
NDDDDEDKTFDESDFDDEAyYAALGTRPPINMEELDESyQKVlE 
LPSVCTNEDPKDRPSAAHIVEALETDV ^^^si^^lE 


5785 


2669 


1388 


FRVRPRVRTOHt^M i iSHiyGPSDSASRPLWV.SilDUMEKDKVKIH 
GILSNTHRQAARVNIiSFDFPFYGHFLREITVATGGFIYTGEWH 
RMLTATQYrAPLMANFDPSVSRNSTVRYFDNGTALWQWDHVHL 
QDNYNLGSFTFQATLLMDGRIIFGYKEIPVLVTQISSTNHPVKV 

«>c * ^^'"^'^^^^^Wi-vjcwt^bwuSKLQSCSSGFDRHRODW 1 
VDSGCFEESKEKMCENTEPVET\Ft.EPPQP*3RQPPSSGS*LPP 
E/DAVTSQFPTSLPTEDDXKIALHLKDNGASTDDSAAEKKGGTI. 
HAGItlVGIIilLVLIVATArLVTVYMYHHPTSAASIFFIE^iPPSR 
WPAMKFRRGSGHPAYAEVEPVGEKEGFIVSEOC 


5786


2669 


1388


i^KVK^KVRTDH>nnfISRjCYGPSDSASRDLWVijlDQMEKDKVKIHH 

GILSNTHRQAARVNLSFDPPFYGHPLREITVATGGPIYTGEVVH 
RMLTATQY IAPLMANFDPSVSRM«5TvpVT?mTr>»iiaT r--.t»*.,« 
^r^»««,, v£»KN5XVRYFDNGTALWQWDHVHL 

QDNYNI^SFTFQATLI^DGRIIFGYKEIPVbVTQISS^NHPVKV 

3^o^^^^*^^^^^^^^^^SW^S^^^CSSGFDRHRQDW 
VDSGCPEESKEKMCSNTEPVET\FLBPPQP*ERQPPSSGS*I^P 
E/DAVTSQFPTSLPTEDDTKIALHLiCDKGASTDDSAAEOCGGTL 
HAGLIVGIt.lLVLIVA'PAILVTVyMYHHPTSAASXPFIERRPSR 
WPAMKFRRGSGHPAYAEVEPVGEKEGFIVSEOC 


5787 


253i 


1674 


faYKI.PAAERHAAiSCSQPPTPTRRRWPAPGKTSKGHRPQM^SGTP '\ 
APRPPARSTVSPASPLPKPRAGRCGSRPRSACSTFRPC*S1.N*M 
S*H*KRNLSQRSSSMSRRPLSCARPHR**RQGLTVAARLPTWAK 
SPPLACSFCQAAQKSQSLSSaRSTR*PBRMSFRP\SPPGNPAlP L ' 
SLAPSSRP/PKGRPQCTWIPSRWPASPTAPPTTT*APTq<;prcT 

GRSMMTCPTRWTATPWSARASSRPRNWPTP*WRPSGRLSTV*RA 
TCGSTATAPPKRFPRNWNPMMAE ^iv ra 


5788 


2 


1460 


r^AASVTSUU^EVNCPVlCQGTLKEAGSLSNCGTHfa?^^ 

T\RyCEIP\GPD\LEESP\TCP\l,CKEPFRP\GSFRPNWOLANV 

VBNIKRLQLVSTLGLGEEDVCQEHGEKIYFFCEDDEMQLCWCR 

EAGEHATHTKRFLEDAA\APVREQIHKCI.KCLIKEREBIQEIQS 

RENmiQVLLTQVSTKRQQVrSEFAHLRKFLBEQQSILLAQI.HS 

QDGDILRQRDEFDLLVAGEICRPSALIEBLEEKNERPAREIXTD 

IRSTLIRCETRKCRKPVAVSPELGQRIRDFPQQALPUJRBMKMP 

LEKLCPELDYEPAHrSLDPQTSHPKLLI^EIDKQRAQPSYKWQNS i' 

PDNPQRFDRATOOAHTQIl^GGRHTVm^SIDIAHGGSCTVGVVS 

EDVQRKGELRLRPEEGVWAVRIAWGFVSAIiGSFPNTRDTLKEOP " 

RQVRVSLDYEVGWVTFTKAVTREPIYTFTASFTRmpPFGLWG 
RGSSFSI*SS ( 




2 


6860 

. 

1 

1 
] 

< 

j 

1 
C 
G 

T 


atiiiVSiCiKSSAYGDArAEGHPAGPGgVSSSTUAISTTTGHQEGPG 1 

SEGEGEGETEGDVHTSNRLHMVRIWLLERI^TLPQLRNVGGVR 

AIPyMQVILMLTTDI,r)GEDEKDKGALDNLLSQI.IAELGMDKKDV 

SKKNERSALNEVHLWMRLLSVFMSRTKSGSKSSICESSSLISS 
f^TAAAHiSSGAVDYCLHVr.if <!T r trvuvrc;r^r\kTr^v?»r%-tr»m,1»* • i 

^^^^!^^^^^^''^^^^^^VFEAYlX)U:.TEMVr,HLPYQl 

;rCGSKEKYRQLRDIJmJDS\HVRGIKKLLEEQGIFLRAS\AnrA 
>PQSALQYDTHSLMEHLKACAEXAAQRTINWQKFCIKDDSVLY 
•LU2VSFLVDEGVSPVLI^rjUSC7U.CGSKVLRAIAASSGSSSAS 
>SPAPVAASSGQATTQSKSSTKICSKKEEKBKEKDGETSGSQEDQ 
.CmLVWQI^IKFADKErLIQFUiCFX.LESWSSSVRWQAHCXTLH 
YRNSSKSQQELLI»DLMWSIWPEr,PAYGRKAAQFVDLLGYFSI,K 
'PQTEKKLKEYSQKAVEILRTQNHILTNHPMSNriyNTLSGLVEP 
K;YYLESDPCLVCNNPEVPFCYIKLSSIKVDTRYTrrQQVVKr,I 
SHTISKOTVKIGDLKRTKMVRTINLyyNNRTVQAIVEr.KNKPJV 
WHiCAKKVQLTPGQTEVKrDLPr.PrVASNLMXEFADPYEmrQA# 
ETLQCPRCSASVPANPGVCGNCGEMVYQCHKCRSIWYDEKDPF 



SEQ 
ID 
NO: 


Predicted
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


5789 






LCNACgFCKyAKFDFMD^AKPCCAVDPJENbbUKKKAVSMINTL ' 

LDKADRVYHQtiMGHRPQLENLLCICVNBAAPEKPQDDSGTAGG^S 

STSASVWRYILQLAOEyCGDCKNSFDELSKIIQKVFASRKELr.E 

YDLQQREAATKSSRTSVQPTPTASQYRALSVLGCGHTSSTKCyG 

CASAVTEHCITLLRALATNPALRHILVSQGLlRELPDYNIiRRGA 

AAMREEVRQLMCLLTRDNPEATQQMNDLIIGKVSTALKGHWANP 

DLASSLQYEMLLLTDS ISKED5CWELRLRCALSLFLMAVNIKTP 

VVVENITLMCLRILQ^aIKPPAPTSKKNKDVPVEALTTVKPYC^I 

EIHAQAQLWLKRDPKASTOAWKKCLPIRGIDGNGKAPSKSEnUm 

LYI^TBKYVWRWKQFLSRRGKRTSPLDLKLGHNNWLRQVLPTPAT 

QAAROAACTIVEALATIPSRKQQVLDLLTSYLDELSIAGECAAE 

YtALYQiaL-TSAHWKVYIJ^GVLPYVGNLITKEIARLIAI,EEA 

TLSTDLQQGYAUCSLTGLLSSPVEVESIKRHFKSRLVGTVTiNGY 

LCI^KLWQRTiaiDETQDMLLEMriEDMTTGTESETKAnWCI 

ETAKRYNI.DDYRTPVFIFERLCSIIYPEENEVTBFFVTLEtCDPQ 

QEDJU3GRMPGNPYSSNEPGIGPI.MRDIKNKICQDCDLVALLED 

DSGMELLVNNKIISLDLPVAEVTKKVWCTTNEGEPMRIVYRMRG 

LLGDATEEFIESLDSTTDEEEDEEEVYKMAGVMAQCGGI.ECMI*N 

RLAGIRDFKQGRHU:.TVLI*KLFSYCVKVKVNROOLVKLBMmi^ 

VMLGTLNIALVAEQESKDSGGAAVAEQVLSIMEIMCAEPNVEP 

LSEDKGNLI*LTX3DICDQX*VMLtJDQlWSTFVRSNPSVIK;GI*IiRIIP 

YLSFGEVEKMOILVERFKPYCNPDKYDEDHSGDDKVFL\DCFCK 

IAAGIK\NNSNGHQL\KDI,\ILQKGITQKALD\YMKKHIP/SAA 

RlWDADI\WKSFauRPALPFIt;RI,LRGLAIQHPGTQVr.lGTDSI 

PNLHKI*EQVS\SDEGIGTLA\ENL\LESLREHPDVNKK1DA\AR 

RBTRAEfCKRMAMAMRQKALGTLG \MTTMEKGQWD /TRTALLEA 

dweelieepXgltccicregykfqptkvlgiytftkrwlggvw 

EWKPRETSRATSTVSHPWIVHYDC\HIA\AVSLARGREEWESAA 

IX3NANTKC3!gGLLPWGPHVTESAPATCLARHNTyLQECTGQREP 

TYQLNIHDIKLLPLRPAMBQSFSABTGGGGRBSNIHLIPYIIHT 

GIiYVLNlTRATSREBKNLQGFI^OPKEKWVESAFEVIXSPyYFTV 

lALHI liPPEQWRATRVEILRRLLVTSQARAVAPGGATRLTDKAV 

KDYSAYRSSIibFWALVDLlYNMPKKVPTSNTEGGWSCSLAEYIR 

HNDMPIYEAADKALKTFQEEFMPVETFSEFLDVAGLLSEITDPE 
SFLfCDLLNSVP 


5790 


1 


2407 

< 


LPLHAVEKTtJKPGQPALKMPGKIJRSDAGbESDTAMKKGETIiRKQ 

TEEKEKKEKPKSDKTEEIAEEEETVFPKAKQVKKKAEPSEVDMN 

SPKSKKAKK\KEEPSQNDISPKTKSLRKKKEPIEKKWSSKTKK 

VTKNEEPSEEEIDAPKPKKMKKEKEMNGETREKSPKLKNGFPHP 

EPDCNPSEAASEESNSEIEQBIPVEQKEG\AFSNFPISEETIICL 

LKGRGVTPIiPPlQAKTFHHVYSGKDLIAQARtGTGKT^SFAIPI, 

lEKLHG\Er<JDRKRGRAPQVI,VLaPTRELANQVSKDFSDITiacI. 

SVACFYGGTPYGGQFERMRNGIDILVGTPGRlKDHIQNQKrJDLT 

KLNHVVLDEVDQMLDMGPADQVEBILSVAYKKDSEDNPQTLLPS 

ATCPHWVFNVAKKYMKSTYEQVDLIGKKTQKIAITVEHIMKCH 

WTQRAAVIGDVIRVYSGHQGRTI IFCETKKEAQELSQWSAIKQD 

AQSLHGDIPQKQRBITLKGFRNGSFGVLVATNVAARGLDIPEVD 

LVIQSSPPKDVESYIHRSGRTGRAGRTGVCICFYOHKEEYQLVQ 

VEQKAGIKFKRIGVPSATEIIKASSKnAIRIaUJSVPPTAISHPK 

OSAEKLIEEKXSAVEAIJLWMISGATSVDQRSLINSm^GFVTM 

ILQCSIEMPNISYAWKELKEQLGEEIDSKVKGMVFI,KGKLGVCF 

DVPTASVTEIQEKWHDSRRWQLSVATEQPELEGPREGYGGFRGQ 

REGSRGFRGQRDGNRRFRGQREGSRGPRGQRSGGGNKSNRSQNK 

3QKRSFSKAFGQ 




3786 


1585 J 
< 
1 
I 
I 
C 
I 


\RRQRDPLQALRRRyQELKQQVDSLLSESQLKEAI^PNi^OHIY 

5RCIQLKQAIDENKNALQKLSKA0ESAPVANYNQRKEEEHT1XD 

aTaCJLQGIAVriSRENirEVGAPTEEEEESESEDSEDSGGEEE 

3AEEEEEEKEENESHKWSTGEEyiAVGDFTAQQVGDLTPKKGEI 

.LVIEKKPDGWWlAKDAKGWEGIiVPRTYIiEPYSEEEECQESSEE . 

JSEEDVEAVr>BTADGAEVK\QRTDPHWSAVQKAISEAGIFCI.Vf 

tVSFCYLIVLMRNRMETVEDTNGSETGFRAWNVQSRGRlFIiVSK 
SEQ
ID 
NO: 


Predicted 
beginning
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








pvlqointvdvlttmgaipagfkpstlsqlleegnqfranVj^^q"" 

PEUVIPSQLAKRDLMWDATEGTIRSRPSRISLILTLWSCKMIPIjP 
GMSIQVLSRHVRLCLFDGNKVLSNIHTVRAWQPKKPKTWTPSP 
QVTRILPCLLDGDCFIRSNSASPDljGILFEIiGISyiRNSTGBRG 
EI*S CaWVFLKLFDASGVPI PAKTVBLFLNGGTPYEKG lEVDPS I 
SRRAHGSVFYQIMTMRRQPQLLVKLRSIiNRRSRNVIjSLLPETIiI 
GNMCSIHLLIFYRQILGDVIxLKDRMSIiQSTDLISHPMIATFPMI, 
I>EQPDVMDALRSSWAGQES\TLKRSEKR\PK3Fr>KVPRFLLVYH 
\GCVLPI»L/HTPTRLPPFRWAEEETETARWiCVITDFLKQNQENQ 
GALQALLSPDGVHEPPjDLSEQTYDPLGEMRKNAV 


5731 


3 


1636 


LRVAEFAGTSR/ IGAGLIQPLHRAPARDHGXjiZjRGGAAPAtisVSH^ 

GN/GKQL/AMSSQGSDDEQIKRENrRSLTMSGHVGFESLPDQliV 

NRSIQQGFCFNILCVGETGIGKSTLIDTLFNTNFEDYESSHFCP 

NVKIiKAQTyEU2ESNVQb!CLTIVNTVGFGDQINKEESYCl?IVDY 

IDAQFEAYLQEEIiKlKRSIiPTYHDSRIHVCLYFISPTGHSUCTL 

DliLTMKNLDSKVYI 1 PVIAKADTVSKTEI»QKFKIKr*MSBLVSNG 

VQIYQFPTDDDTIAKVNAAMNGQLPFAVVGSMDEVKVGNKMVKA 

RQYPWGWQVENEKHCDFVnCLREMLICTNMEDLREQTHTRHYEI, 

YRRCKLEBMGFTDVGPEWKPVSVQETyEAKRHBFHGBRQRKEEE 

MKQMFVQRVKEKEAILKEAERKLQAKPEHIiKRIiHQEERMKliEEK 

RRI*LEEEI I AFSKKKATSEI FHSQS FIATGSNLRKDKDRKNSQF 

PVKQKVPEHRRSSSQANFIKKKLEVCFDFAVICFITSIFGEQPQ 

LLI FMEKYFOVQGQYISQS E 


5792


2263 


653 


AAAAPSPAWWCaVFWYWHTCWVMYGIVYTRPCSGDASCIQPY ' 

IimRPKIX3I*\HHSFTTTRSHLGAEKNIDLVLNVEDFDVESKPER 

TVNVS VPKKTRNNGTL YAYI FLHHAGVLPWHDGKQVHr.VSPLrT 

YMVPKPEEINIiLTGESDTQOIEADKKPTSALDEPVSHWRPRLAIi 

NVMADNPVFDGSSLPADVHRYKKMIQLGKTVHYLPILFIDQLSN 

RVKDLMVINRSTTEXiPr,TVSYDKVSIiGRriRFWlHMQDAVYSLQQ 

FGPSEKDADEVKGIFVDTNXjYFLALTFFVAAFHLIiPDPLAFKSID 

ISFWKKKKSMIGMSTKAVLWRCFSTWIFLPIiUDECyrSLriVIiVP 

AGVGAAIELWKVKKALKMTIPWRGLMPEFQFGTYSESERKTEEY 

DTQAMKYLSYLLYPLCVGGAVYSLLNIKVKSWYSWLIHSFVNGV 

YAFGFLFMLPQLFVNYKLKSVAHLPWKAFTYKAFNTFIDDVPAF 

IITMPTSHRLACFRDDWFIiVYXYQRWLYPVDKRRVWEFGESYE 

EKATRAPHTD 


5793 


2263 


653


AAAAPS PAWWCGVFWYWHTCWVMYGI VYTRPCSGDASCIQP Y 
IARRPKLQL\RHSFTTTRSHLGAENNIDXiVLNVEDFDVESKFER 
TVNVSVPKKTRTmGTLYAYI PLHHAGVLPWHDGKQVHLVSPLTT 
YMVPKPEEINLLTGESDTQQIEADKKPrSALDEPVSHWRPRlAI, 
NVMADNFVFDGSSLPADVHRyMKMIQIiGKTVHYLPII,PIDQLSN 
RVKDLMVINRSTTELPLTVSYDKVSLORLRFWIHMQDAVYSLQQ 
FGFSEKDADEVKQIPVDTNLYFLALTFFVAAFHLLFDFIiAFKND 
1 SFWKKKKSMIOKSTKAVLWRCFSTWI FliFLLDEQTSLLVLVP 
AGVCAAIELWKVKKALKMTIFWRGLMPEFQPGTYSESErRKTEEY 
DTQAMKYLSYLLYPIiCVGGAVYSLLNXKYKSfJYSWIirWSPVNGV 
YAFGFIiFHLPQLFVNYKLKSVAHLPWKAFTYKAFNTFIDDVFAF 

Ail l^iir' J. aHKlMK^t KVD\rvFJjVxJSxQR^ 
EKATRAPHTD 


5794 


1 


5016 


MGPRLSVWIjIjIiLPAALI^HEEHSRAAAKGGCAGSGCGKCDCHGV 
KGQKGERGLPGLQGyiGFPGMQGPEGPQGPPGQKGDTGEPGI.PG 
TKGTRGPPGASGYPGNPGIiPGIPGQDGPPGPPGIPGCNGTKGER 
GPIiGPPGLPGFAGNPGPPGLPGMKGDPGEILGHVPGMLLKGERG 
FPGIPGTPGPPGLPGLQ3PVGPPGFTGPPGPPGPPGPPGEKGQM 
GLSFQGPKGDKGDQGVSGPPGVPGQACJVQEKGDFATKGEKOQKG 
EPGFQGMPGVGEKGEPGKPGPRGKPGKDGDKGEKGSPGFPGEPG 
YPGLIGRQGP\QGEKGEAGPPGPPGIVIGTGPIiGEKGERGYPGT 
PGPRGEPGPKGFPGLPGQPGPPGLPVPGQAGAPGFPGERGEKGD 
RGFPGTSLPGPSGRDGLPGPPGSPGPPGQPGYTNGIVECQPGPP 
GDQGPPG IPGQPGF IGE IGE KGQKGESCLICD XDGYRGPPGPQOf 
PPGEIGPPGQPGAKGDRGLPGRDGVAGVPGPQGTPGLIGOPGAK j 



PKGSPGSVGLKGERGPPGGVGFPGSRGDTGPPGPPGYGPAGP* 
DKGQAGFPGGPGSPGLPGPKGEPGKIVPLPGPPGAEGLPGSPr' 
PGPQGDRGFPGTPGRN PGL\ PGEKO a v«\ nDi^r/-n.T*/^»^*s««„. 




61 



5736 



5798 



$44 



5739 



2679 



1078" 



"*^'^^^"«"*^«f^7VUUOKGcPGVGLPGI*KGr» 

PGLPGIPGTPGEKGSIGVPGVPGEHGAIGPPGLQGIRGEPGPPG 
LPCSVGSPGVPGIGPPGARGPPGGQGPPGLSGPPGIKGEKGFPG 
FPGLDMPGPKGDKGAQGLPGITGQSGX.PGLPGQQGAPGIPGFPG 
SKGEMGVMGTPGQPGSPGPWGAPGLPGEKGD\HGFPGSSGPRGD 
PGLKGDKGDVGLPGKPGSMDKVyMGSMKGQKGDQGEKGQIGPIG 
EKGSRGDPGTPGVPGKDGQAGQPGQPGPKGDPGISGTPGAPGLP 
GPKGSVGGMGLPGTPGEKGVPGIPGPQGSPGLPGDKGAKGEKGQ 
AGPPGIGIPGLRGEKGDQGIAGFPGSPGEKGEKGSrRTPnMTv-c 



r^xjivv,oi'u:3vyxt^aPGl.PGEKGDKGLPGUX5IPGVKGEAGL^G 
TPGPTGPAGQKGEPGSDGIPGSAGEKGEPGLPGRGFPGPPGAKG 
DKGSKGEVGFPGLAGSPGIPGSKGEQGFMGPPGPQGQPGLPGSP 
GHATEGPKGDRGPQGQPGLPGLPGPMGPPGLPGIDGVKGDKGKP 
GWPGAPGVPGPKGDPGFQQMPGIGGSPGITGSKGDMGPPGVPGF 
QGPKGLPGU3GIKGDQGDQGVPGAKGLPGPPGPPGPYDIIKGEP 
GLPGPEGPPGLKGLQGLPGPKGQQGVTGLVGIPGPPGIPGFDGA 
PGQKGEMGPAQPTGPRGFPGPPGPIX5LPGSMGPPGTPSVDHGFL 
VTRHSQTIDDPQCPSGTKILYHGYSLLYVQGNERAHGQDLGTAG 
SCUiKFSTMPFLFCNrNKVCNPASRNDYSYWLSTPEPMPMSMAP 
ITGRNIRPFISRCAVCEAPAMVMAVHSQTIQIPPCPSGWSSLWI 
GYSPVMHTSAGAEGSGQALASPGSCI^EFRSAPFIECHGRGTCN 
YYANAYSPWLATIERSEMFKKPTPSTLKAGEIJiTHVSRCQVCMR 



STRSPTVEYISAHPHXLFMLLK GyERPOIALRCGIMLRECIRHE - 

PLAKIILFSNQFRDFFKYVELSTFDIASDAFATFKDLLTRHKVI. 

VADFl.EQNYDTIFEDYEKLI^SSNY\n^K3iQSLia:^EI.ILDRHN 

FAIMTKYISKPENLKLMMNLLRDKSPNIQPEAFHVFKVFVASPH 

KTQPIVEILLKNQPKLIEFLSSFQKERTDDEQFADEKWYLIKQI 
RDIjKKTAP»RALRDSKR 

UKVGWEX.WCMYISPPKD WWDAGDPSI.PIRTPAMIGCSFVVNR 
FGEIGLLDPGMDVYGGENrTRT.rtTTfxnirf./-v-/-c?Mi?trr r,^r,«,r«,. 



IKF 



891 



115 




*v*v*^iivo«A^,i? X AiU<«AlaRVAEVWMDDyKSHVyiAWNLPrjSKrP 

GIDIGDVSERRALRICSIiKCKNFQWYLDHVYPEMRRYNNTVAYGE 

UlNWK?VKDVCLDQGPLE>mAILYPCHGWGPQI^YTKEGFWIL 

GAl^TTTLLPDTRCLVDMSKSRLPQLLDCDKVXSSLYKRWNFIQ 

WGArMNKGTGRCLEVENRGLACIDLII^SCTGQRWTIKNSIK*R 

EGAGALEPGPQDMAAPPNIWTSCPGGETARGRQVLDGPPRASPG 
QHRDPG 



PitVRQKTLVDvTLENSMIKD QlKNUjQTYEASHDKLREKQRQLE 
VAQVENQIiLKMKVESSQEANAEVMREMTKKJbYSQYEEKLQEEQR 
KHSAEKEALLEETNSPLKAJIEEANKKMQAAEISLEEKDQRIGEL 
DRLIERMEKSRHQLQLQLLEHBTEMSGELTDSDKERYQQLEEAS 
ASLRERIRHLNDMVHCQQKKVKQMVEEIESLKKKLQQKQLLILQ 
LLEKISFLEGENNELQSRLDYLTETQAKTEVETREIGVGCDLLP 
S QTGRTREIVMPSRKYTPYTRVIiELT MKKTtiT 
KII^aKWKSNSNQEKQPYYEEQARLSKI HLEKYPKYKYKPRPKR " 
TCIVDGKKLRIG3YKQLMRSRRQEMRQPPTVGQQPQIP1TTGTG 
WYPGAITMATTTPSPQMTSDCSSTSASPEPSLPVIQSTYGMKT 
P ggSLAGNEMIMGEDEMEMYDD YEDDPKSDYSSENEAPEAVSAM 



LLSTrXKFINLF?ETKA?IQGVLRAGS QI^RNAriVELQQRAVEYL ' 
TLSSVASTDVLATVLEEMPPFPERESSILAKLKRKKGPGAGSAL 
DDGRRDPSSNDINGGMEPTPSTVSTPSPSADLLGLRAAPPPAAP 
PASAGAGNLLVDVFDGPAAQPSLGPTPEEAPI>SPGPEDIGPPIP 
EADELLMKFVCKNNGVLPEITQLLQIGVlCSEFRQNLGRMyLFYGN 
KTS VQPONFS PTWHPGDLQTQLAVQTKR VAAQVDGGAQVQQVL . 
WlECIiRDFLTPPLLSVRFRYGGAPQAl.TLiaPVTINKPFQPTEM''' 
ARQDFFQRWKQLSLPQQBAQKIFKANHPMDAEVTKAKLIiQFQSA 


5800 








5801 


2679 


1435 


TLSSVASTDVIATVLESMPPFPERES^II^KI^KRKKGPGAG^ 

DDGRRDPSSNDINGGMHPTPSTVSTPSPSADLLGLRAAP^PAAP 

PASAGAGNLLVDVFDGPAAQPSIX3PTPEEAPLSPGPEDXGPPIP 

EADELLNKPVCKNNGVLFENQLLQIGVKSEFRQNXXSRHYIbFYGN 

KTSVQFQNPSPIVVHPGDLQTQIAVQTKRVAAQVDGGAQVQQVL 

NIECLiy5FLTPPLLSVRFRYGGAPQALTLKI,PVTINKPP0PTEM 

AAQDFFQRWKQLSLPQQEAQKIFKANHPMDAEVTKAKLLGFGSA 


5802 


3 


1413 


l>PRLYHLIPwElTSIKINHVJL>PSESLSlRl.VGGljKTPl.VHIII 
QHIYRDGVIARDGRLLPGDIILKVNGMDISNVPHNYAVRLLRQP 
CQVLWLTVMREQKFRSRNtJGQAPDAYRPRDDSFHVILNKSSPEE 
QLGrKLVRKVDEPGVPIFNVLDGGVAYRHGOLEENORVLAINGH 
DLRYGSPESAAHLIQASERRVHLWSRQVRQRSPDIFQEAGWNS 
NGSWSPGPGERSNTPKPLHPTITCHEKWWtQKDPGBSLGMTVA 
GGASHREWDI^PrYVISVEPGGVlSRDGRIKl-GDILLNVDGVELT 
BVSRSBAVALLKRTSSSIVLKALEVKEYEPQEDCSSPAAI*DSNH 
NMAPPSDWSPSWVMWLELPRCLyNCKDIVI.RRNTAGSLGFCIVG 
2rf!!I^^^^^^*^^^'^^'^^^^^^^^^<5J>^^VMGRSTSG 
MIHACLARLLKELKGRIXLTIVSWPGTPL 


5803 


3 


290 


CF5I,YQlMEklM0LPTIiLRHAl:mMFSVGGl,FWMi>llIRITLCTj, 
^^F^^SPLDFVPEALFGIIXSFLDDFFVIFLLLIYISIMYREV 


5804 


2234 


1299 


liAQFGTTAEI YAyR£EQDFGIKIVKVKAIGRCjRlr-KVI.ELRTQSD 
GIQQAKVQII,PHCVj;,PSTMSAVQLESrjsJKCQlFPSKPVSREDOC 
SYKWWQKYQKRKTOCAWLTSWPRWLYSI.YDAETLMDRIKKQr^ 
KDENLKDDSLPSNPIDFSYRVAACLPIDDVj:,RrQIj:,KIGSAIOR 
LRCEI^IMNKCTSLCCKQCQETEITTKNBIPSLSLCGPKAAYVW 
PHGYVHETJCTVYKACNLNLIGRPSTEHSWFPGVAWTVAOCKICA 
SHIGWKrTATKiCDKSPQKFWGLTRSAXJ:,PTlPDTEDEISPDKVI 


5805 


J. 


1707 


iiWl^KQRgEEQRKRTJjEEKKRRiEQPMI^KRXTOREIAKRAEQIE^ 

DINNTGrESASEEGDDSU[,rTWPVKSYKTSGKMKKNFEDLBKE 

REEKERIKYEEDKRIRYEEQRPSLKEAKCLSt.VMDDEIESEAKK 

ESLSPGKLKLTFEELERQRQENRKKQAEEBARKRbEEEKRAPEE 

ARRQMVNEDEENQDTAKIFKGYRPGKLKLSFEEMERQRREDEKR 

KAEEEARRRIEEEKKAFAEARRNMVVDDDSPEMYICriSQEFI.TP 

GKLEINFEELLKQKMEEEKRRTEEERKHKLEMEKQBFEQLRQEM 

GEEEEENETFGLSREYEELrKLKRSGSlQAKNLKSKPEKIGQl^ 

EKEIQKKIEEERARRRAIDLEIKEREAENFHEEDDVDVRPARKS 

EAPFTHKVNMKARFEQMAKAREEEEQRRIEEQKLLRMQPEQRBI 

DAALQKKRBEEEEEEGSIMWGSTAEDEEQTRSGAPWFKKPLKNT 

SWDSEPVRFTVKVTGEPKPEITWWFEGEILQDGEDYQYIERGE 

TYCLYLPETFPEDGGEYMCKAVNNKGSAASTCILTTPqKW 


5806 


3 


776 



riSPTX/3QVYKSiCIRWWI£ENUiNGNISVDDl.lALLDIAEHASS 
ftPKESQQQSEDREYEVKERLYPKSKRRyDTYNIAGYQGEIEVGL 
SfTIQILQLIPFFDNKNELSKRYMVWFVSGSSDIPGDPNNEYKLA 
OrajYIPYLTKLKFSLKKSFDPFDEYFVLLKPRNNIKQNEEAKTR 
^KVAGYFKKyVDIFCLLEESQNNTGI^SKFSKPLQVERCRRKLV 
U>KADICFSGLLEYr,IKSQEDAlSTMKCIVNEyTFLLK 


5807 


1257 


877


Wlr-TFHNHGRTANLYSLHSHi^XXTVFI.FALWLGFAVPLLPW 
^MWLRSLLKPIHVFFGAAILSLSIASVISG1NEKLPPSI.KNTT 
IPYHSLPSEAVFANSTGMLWAFGLLVLYirjASSWKRP 




22G7 


1302


t*:SKKTFRRPMAVDIQPACL&£,YCGlCri:.LFiCNGSTEiYGECGVC 
RGQRTNAQKYCQPCTESPELYDWLYLGFMAMLPLVtiHWFFXEW 
SGKKSSSALFQHITALFECSMAAl ITLLVSppVGVTiYIRSCRV 

wlsdwytmlynpspdyvttvhctheavyplytivfiyyafclI 


5808 2




i^MLl.RPLI>VKKIACGI^KSDRPKSIYAALYFPmmOM^^ 
GLLYYAFPYIILVLSLVTIAVYMSASBIENCYDIXVRKKRLI^^ 


5809 464


433 


SLPDSGWEYLSNGGVADWHKDFGEbRYNECLMMFSOyGKNGSS-" 

EGRITHGFOLKSAYENNIJqPYTNYTFDFKGVIDYlPYSKTHMNV 

LGVLGPLDPQWLVENNITGCPHPHIPSDHFSIiLTQI,EIJlPPLLP 
LVNGVHLPNRR v^'^fviiuf 




2422 


XLVPGFQGXLHPGVYCALQSQHQAQELVADIDBCEVSGLCRHGG " 

RCVNTHGSFECYCMDGYLPRNGPEPFHPTTDATSCTEXDCGTPP 

EVPDGYIIGNYTSSLGSQVRYACREQFFSVPEDTVSSCTGLGTW 

ESPKLHCQEINCGNPPEMRHAILVGNHSSRLGGVARYVCOEGPE 

SPGGKITSVCTBKGTWRESTLTCTEILTKIHDVSLFNDTCVRWQ 

INSRRINPKISYVISXKGORLDPMESVREETVNLTTDSRTPEVC 

LALYPGTNYTVNISTAPPRRSMPAVIGFQTAEVDLLBDDGSFNI 

SIFNETCLKLNRRSRKVGSEHMYQFTVr^QRWYIANPSHArSEN 

FTTREQVPWCLDLYPTTDYTVNVTLLRSPKRHSVQITIATPPA 

VKQTlSNISGPNETCl.RWRSIKTADMBEMyLFHrWGQRWYQKEF 

AQEMTFNISSSSRDPEVCLDLRPGTNYNVSLRAI^SBLPVVISL 

TTQITEPPLPEVEFFTVHRGPLPRLRLRKAKEKMGPISSYQVLV 

LPLALQSTFSCDSEGASSFFSNASDADGYVAAELLAKDVPDDAM 

BIPIGDRLYYGEYYNAPLKRGSDYCIILRITSEWMKVRRHSCAV 

WAQVKDSSLMLLQMAGVGLGSLAWIXLTFLSPSAV 


5810 3 


1641 


&VFGTHKDHEVS-IXDTAISAVKVQIAEF1JSNLQEKSJ^IEAFVS 

BIESPFNTIEE>7CSKNEKRLEEQNEEIS®/IKKVLAQYDEKAQSFEE 

VKKKK^5EFIaEQMVHFllQSMDTAKDTI£TIVREAEBLDEAVFLT * 

SFEEINERLLSAMESTASLEKMPAAFSLFEHYDDSSARSDOMLK 

QVAVPQPPRLEPQEPNSATSTTIAVYWSMNKEDVIDSFQVYCME 

EPQDDQEVNELVEEYRLTVKBSYCIPKDLEPDRCYQVWVMAVNF ' 

TGCSLPSERAIFRTAPSTPVIRAEDCTVCWNTATIRWRPTTPEA 

ini I A^Jt^-KgHSPEGEGLRSPSGlKGLQUCVNLQPNDNYFFYV 

RAIWAFGTSEQSEAALISTRGTRPLLLRETAHPALHXSSSGTVI 

SFGERRRLTEIPSVLGEELPSCGQHYWETTVTDCPAYRLGICSS 

SAVQAGALGQGETSWYMHCSEPQRYTFFYSGIVSDVHVTERPAR 

VGI.LLDYNNQRLXFIKAESEQLI.FIIRHRFNBGVHPAFALEKPG 

KCTIiHIiGIBPPDSVRHK ^^^t^ 


5812




851 


AAAZADPI>PEDJ^SAEKRRPLKSSLGYEXTFSi.U^PDPlCSHDVY 
WDIEGAVRRYVOPFLNALGAAGNFSVDSQILYYAMLGVNPRFDS 
ASSSYYU)MHSl.PHVINPVESRX/?SSAASLYPVLNPrJLYVPEIA 
HSPLYXQDKDGAPVATNAPilSPRWOOlMVYKVDSKTYNASVLPV 
RVEVDMVRVMBVFLAQLRXJ:.PGIAQPQLPPKCI,LSGPTSEGIJ^r 
WELDRLLWARSVENIATATTTLTSlAQnLGKISNIVXKDDVASE 
VYKAVAAVQKSAEELASGHIiASAFVASQEAVTSSELAFPDPSr^ 
^YFPDDQKFAIYIPLFLPMAVPILLSLVKIFLETRKSWRICPE 




5204 


2744 



OGRQRCQRGRSCGAREBEVEPGTARPPPAASAMUASLEKIADPr"" 

LAEMGKNLKBAVKMLEDSQRRrEEENGKmCSGDIPGPXOGSGQ 

DMVS 1LQLVQNLMHGDEDEEPQSPRIQNIGECGHMALLGHSI/3A 

YISTLDKEKLRKLTTRILSDTrLWLCRIFRYENGCAYraEEERE 

SLAKICRIiAIHSRYEDFWDGFNVljYNKKPViyLSAAARPGLGO 

JfLCNQLGbPFPCLCRVPCNTVFGSQHQMDVAFLEKLXKDDIERG 

RLPLLLVAHAGTAAVGHTDKIGRLKELCEQYGIWLHVEGVNIAT 

CAI/3yVSSSVXAAAKCDSMTMTPGPWX/5LPAVPAVTLYKHDDPA 

[.TLVAGLTSNKPTDKX,RALPLWI.SLQYLGLDGFVERIKHACQbS 

3RLQESLKKVNYIKILVEDELSSPVWFRFFQELPGSDPVFKAV 

>VPKMTPSGVGRERHSCDALNRWLGBQLKQLVPASGLTVMDLEA 

:gtclrfsplmtaavlgtrgedvdqlvacxesklpvlcctik3LR 

^EFKQEVEATAGLLYVDDPNWSGIGWRYEHANDDKSSLKSYPQ 
JENIHAGLrjCKX^WELESDLTFKIGPEYKSMKSCLYVGMASDNVH 

aelvetxaatareiednsrllenmtewrkgxoeaqvelqk* 
•erleteegvlirqi pwgs vlmwfspvoalqkgrtfni^tagsles 



5815 



2936 






699 



432 



TEPIYVYKAQGAGVTLPPTPSGSRTKQRLPGQ RPFKKSLRG^DA 
LSK'l-SSVSHIEDLEKVERLSSGPEQITLEASSTEGHPGAPSPOH 
TDQTEAFQKGVPHPEDDHSQVEGPBSLR 

HRDGVSGSi,ERPLTDRSRTGAFAQQRGKMATAGGGSGADPGSRG 
LLRLLSFCVLLAGliCRGNSVERKiyiPr^NKTAPCVRLLNATKQI 
GCQSSXSGDTGVXHVVEKEEDLQWVLTDGPNPPYMVLLESKHFT 
RDLMEKLKGRTSRIAGLAVSLTKPSPASGFSPSVQCPNDGFGVY 
SNSYGPEFAHCREIQWWSLGNGIAYEDFSFPIPLLEDENfeTKVl 
KQCYQDHNLSQNGSAPTFPLCAMQLPSHMAWLSPSTAT\CMRRS 
SIQSTFSINPKIVCDPLSDYNVWSMLKPINTTGTtiKPDDRVWA 
ATRLDSRSFFWNV\APGAESAVASFVTQIAAAEAIiQKAPDVTTI* 
PRNVMFVFFQGETFDYIGSSRMVYDMEKGKFPVQIoENVDSFVBI, 
GQVALRTSLELWMHTDPVSQKNBSVRNQVEDIiLATEiEKSGAGVP 
AVILRRPNQSQPLPPSSLQRFLRARNISGWXJVDHSGAFHNKYY 
QSIYDTAENINVSYPEWI>EPI.KE/ETWNFG*QDTAKAI»ADVATV 

lgralyelaggtnfsdtvqadpqtvtrllygVflikannswpqs 

ILQGRDLRSYLG*RGLFQH\YIAV\SSPTNTIYV/VI.QyAIiANL 

TGTWNLTREQCQDPSKVPSBNKDLYEYSWVQGPLHSNETDRLP 

RCVRSTARLARALSPAFELSQWSSTEYSTWTESRWKDIRARIFIj 

lASKELELITLTVGFGILIFSLIVTYCINAKADVLFIAPREPGA 
VSY 

ALKCRPRRVLAILVGPVQPDRMAEEGAVAVCVRVRPLNSREESIi 
GETAQVYWKTHNNVI YPVDGS KSFNPDRVLHGNETPKNVYEA\I 
AAPI IDSAIQGYNGTIFAX YGC^N^GKTYTMMGSEDHLGVIPQ 

gqfhghpsqki*evfldrefli*rvsymeiynbtitdllcgtqkm 

KPLIIRSDVNRWVYVADLTEEWYTSEMALKWITKGEKSRHYGE 

TK^4NQRSSRSHTIPRMILESREKGEPSNCEGSVKVSHIiNLVDLA 

GSERAAQTQAAGVRLKEGCNINRSLFiriGQVIKKLSDGQVGGFI 

NYRDSKLTRILQNSr/SGNPKTRIICTITPVSFDETXiTALQFAST 

AKYMKNTPYVNBVSTDEALLKRYRKEIMDLKKQIiEEVSl^ETRAO 

A^tBKDQLAQLLEEKDLLQKVQMBKIENLTR^iLVTSSSLTLQQ3L 

KAKRKRRVTWCLGKINKMKNSNYADQFNIPTNITTKTHKI^SIML 

LREIDESVCSESDVFSNTLDTLSEIEWNPATKLLNQENIESELN 

SLRADYDNLVLDYEQLRTEKEEiy[ELKIiKEKND3LDEFEALERICTK 

KDQEMQLIHEISNLKNLVKHREVYNQDLENELSSKVELLREKED 

QIKKLQEYIDSQKLENIKMDLSYSLESIEDPKQMKQTLFDAETV 

ALDAKRESAFLRSENLELKEKMKELATTyKQMENDrQi:»YQSQr.E 

AK3CKMQVDt.EKELQSAFNEITKr>TSLIDGKVPKDLI*CNLEI*EGK 

ITDLQKEIiNKEVEEIJEAI^REEVILl^SBLKSLPSEVERIaRKEIQD 

KSEELHI ITSEKDKLFSEWHKESRVQGLLEEIGKTKDDIATTQ 

StmCSTDQEPQNFKTI*HMDFEQKYKMVLEENEHMNQEIVN3USKE 

AQKFDSSLGALKTEIiSYKTQELOEKTREVQERLNEMEQIJKBQLE 

NRDSPI^TVEREKTI,rTEKLQOTLEEVKTLTQEKDDLKQLQESL 

QrERDQLKSDIHDTVNMNIDTQEQCJtNALESriKQHQETINTLKS 

KISEEVSRNmMEENTGETlCDEFQQKMVGIDKKQDLEAKNTQTL 

TADVKDWEirEQQRKIFSLIQEKNEUXiMLESVIABKBOr.KTDI* 

KENIEMTIENQEELRLU3DELKKQQBIVAQEKNHAIiaCEGELSR 

TCDRIAEVEEKLKEKSQQLQEKQQQLLNVQEEMSEMQKKINEIE 

NLKNELKNKELTLEHMEXERLELAQKI^ENYEEVKS ITKERKVL 

KELQKSFETERDHLRGYIREIBATGLQTKEEIiKIAHIHLKEHQE 

TIDELRRSVSEKTAQIXNTQDLBKSHTiCLQEElPVLHEEQELLP 

NVKKVSETQETMNEXiEIilirEQSTTKDSTTLAR lEMERLRtiNEKF 

QESQEEIKSLTKERDin^IKEALEVKHDOLKBHIRETLAKIQE 

SQSKQEQSLNMKEKDNETTKIVSEMEQFKPKDSAUjRIEIBMLG 

LS KRLQESHDEMKSVAKEKDDLQRLQBVLQSESDQI.KENI KEI V 

AKHLBTEEELKVMCCLKEQEETIKrELRVNLSEKETEXSTrQKQ 

LEAINDKLQNKIQEIYEKEEQLNIICQISEVQBKVNELKQPKEHR 

KAKDSALQSIESKMLEIiTNRLQESQEEIQIMXKEKEBMICRVQEA 

LQIERDQLKENTKEIVAKMKESQEKEYQFLKMTAVNETQEKMCE 

lEHLKEQFETQKliNIiENIETENIRLTQXUlENLEEMRSVTKERlJ 

DLRSVEETLKVERDQLKENLRETXTRDLEKQEELKIVHMHLKEH 








QEriDKLRGIVSEKTNEISNMQKDLEHSNDALKAODliktQEEI^ 

lAHMHLKEQQETIDKLaGIVSEKTDKLSMMQKDLENSNAKUjEK 

IQELKANEHQLITLKKDVNETQKKVSEMEQLKKQIKDQSLTZiSK 

LEIENLNIAQKIiHENLHEMKSVMKERDNLRRVEETLKl»ERDQl.K 

ESLQETKARDIuEIQQELKTARMLSKEEKETVDKliREKlSBKTIQ 

ISDIQKDLDKSKDELQKKIQEIiQKKBLQIiRVKEDVNMSHKKIN 

EMEQtiKKQFEPNYLCKCEMDNPQLTKKIiHESLEEIRlVAKERDE 

LRRIKESLKMERDQFIATLREHIARDRQNHQVKPEKRLLSDGQQ 

HLMESLREKCSRIKELLKRYSEMDDHYECLWRLSLDLEKEIEFH 

RIMKKLKYVLSyVTKIKEEQHECINKFEMDFIDEVEKQKELLIK 

IOHLCX?DCDVPSRELRDLKXNQNMDLHIEElLKDFSESEFPSIK 

TEFQQVLSNRKEMTQFLEEWLNTRFDIEKLKNGIQKENDRICQV 

NWFFNNRl lAIMNESTEFEERSATISKBWEQDUCSLKEKKEKliF 

KNYQTLKTSIiASGAQVNPTTQDNKNPHVTSRATQLTTEKIRELE 

NSLHEAKESAMHKESKlIKMQKELEVTNDlIAKIiQAKVHESNKC 

LEKTKETIQVLQDKVALGAKPyKEEIEDLKMIOjGKIDLEICMKKA 

KEFEKBISATKATVEYQKEVIRLLRENLRRSQQAQDTSVISEHT 

DPQPSNKPLTCGGGSGIVQtmCAI,ILKSEHIRLEKEISKLK(3QN 

EOliXKQKNELLSNNQHLSNBVKTWKERTLKREAHKQVTCENSPK 

SPKVTGTASKKXQITPSQCKERNLQDPVPKESPKSCFFDSRSKS 

LPSPHPVRYFDNSSLGLCPEVQNAGAESVDSQP\GPWARLFQGK 

DVP\ECKTQ 


53X5 


23 


1460 


SELV^lWTVQNRESIjGLLSFPVMITMVCCAHSTNEPSNMSYVKEt 
VDRLLKGYDIRLRPDFGGP PVDV6MRI DVAS IDMVSEVNMDYTL 
myFQQSWKDKRDSYSGI PLMLTLD»RVADQI*WPDTYFIiNDKK 
S PVHGVTVKNIU4IRI,HPDGTVLYGLRITTTAACMIya>UlRyPLDE 
CNCTLEIES YGYTTDDI EP YWNGGEGAVTGVNKtEtiPQFS I VDY 
KMVSKKVEFTTGAypRLSLSPRLKRNIGYFILQTYMPSTXiITIL 
SWVSFWINYDASAARViU.GITTVI,TMTTlSTKI>RETIiPKIPyVK 
AIDIYLMGCFy)FyFLAI,r.EYAFVNYrFFGKGPQKKGASKQI>QSA 
NEKNKLEMNKVQVDAHGNILIiSTLElRNETSGSEVIiTSVSDPKA 
TMySYDSASIQYRKPLSSREVA*GRAPDRHGVPSKGRIRRRAS\ 
QLKYKXPDLTDVNSIDKWSRMFFPITFSLFNWYWLYYVH 




5816


861 


191 


TSSRSRAATiQEGDAETPGSVERRGRRAGAEDGMSQAPGAQPSPP ' 
TVYHERQRLELCAVHALNNVLQQQLFSQEAADEICKRLAPDSRIj 
NPHRS LI/STGNYDVWlMAALQGLGLAAVWWDRRRPLSQLAIiPQ 
VLGLILNLPSPVSr*GLLSLPLRRRKIiRWPCARi:»/VTV5YYKLDS 
K\l,RAPEGPGaLRTE\*GPFLAAAlAQGI.CBVIiLWTKEVEEKG 
SWLRTD 




5817


851 


118 


rlfrgpganrgrscrgcsggrepsggalpkrhcpc*ppsppaad 
vmsnttvpwapqansdsmvpyvlgpffliti.vgvwawmyvqk 
kkrvdrlrhhllpmysydpaeelheaeqellsdmgdpkwVqag 
rvatstsgchcwmsrrdltplphpsepgvldclapchzilipllsp 
gspcwviiglhfslhppsaasashaltitsbppgliipfvgveiita 
hpqalmgrgfpsgmaaagrhlcfl 




5818 


3 


3518


QAIiRDKLWIFIiVQSPyAVRHTESWKLMSTDOQQKIQAATlFDKGD " 

NKKMKSDGLGASGHSSSTNRNS INKTLKQDDVKEKDGTKIASKI 
TKELKTGGKm/SGK^PKTVTKSKTENQDKARLENMSPRQWERSA 
TAAAAATGQKNLLNGKGVRNQEGQISGARPKVLTGNIiMVQAKAK 
PLKKATGKDSPCLS I AGPSSRSTDSSMEFS ISTECI.DEPKBNGS 
TEEEKPSGHKLSFCDSPGQMMKNSVDSVKNSTVAIKSRPVSRVT 
NGTSNKKSIHEQDTNVNWSVLKKVSGKGCSEPVPQAILKKRGTS 
NGCTAAQQRTKSTPSNLTKTQGSQGESPNSYKSSVSSRQSDENV 
AKLDHNTTTEKQAPKRKMVKQVHTAr,PKVNAKI\^PKNLNQSK 
KGETTJ5NKDSKQKMPPGQVISKT0PSSQRPIiKHBTSTVQKSMFH 
DVRDNNWKDSVSEQKPHKPLINIASEISDAEA1jQSSCRP\DP0K 
PLlTOQEKEKLALECQNISKLDKSLKHELESKQlCXDKSBrKFPN 
KKETDDCDAAWICCHSVGSDNVKSKFYSTTALKYMVSNPNBNSL 
NSNPVCDLDSrSAGQIHLISDRENQVGRKDTNKQSSIKCV3DVi 
liCNPERTNGTI^NSAQEDiCKSKVPVEGLTI PS Kr.SDESAMDEDKH 
WO 01/53312



PCT/US00/34263











ATADSDVSSKCFSGQLSEKNSPKNMkTSESPESHETPETPFVGH"" 

WWX.STGVLHQRESPESDTGSATTSSDD1KPRSBDYDAGGSQDDD 

GSNDRGISKCGTMLCHDFIXSRSSSDTSTPEELKIYDSNU^IEVK 

MKKQSSNDbFQVNSTSDDEIPRKRPEIMSRSAIVHSRERENIPR 

GSVQFAQEIDQVSSSADETEDBRSE/yENVAEKFSISNPAPQQPQ 

GI INIiAFEDArENECREFSANKKFKRSVLLSVDECEELGSDEGE 

VHTPFQASVDSFSPSDVPDGISHEHHGRTCYSRFSRESEDNILE 

CKQNKI3NSVCKNESTVLDLSSIDSSRKNKQSVSATEKKNTIDVL 

SSRSRQLLREDKKVNNGSNVENDIQQRSKPLDSDVKSOERPCHL 

DLHQREPNSDIPKNSSTKSLDSFRSQVLPQEGPVKESHSTTTEK 

ANIALSAGDIDDCDTLAQTRMYDHRPSKTLSPIYEMDVIEAFEQ 

KVES3Il{VTDMDF*DDQHFAKQDWTi:,LKQLLSEQDSNI»DVTNSV 

PEDLSLAQYLrNQTLLIiARDSSKPQGITHIDTLNRWSBLTSPLD 

SSA5ITMASFSSEDCSPQGEWTILELETQH 


5819 
5820 


1 


5557 


AAAGLLGALHIiVMTLWAAARAEKEAFVQSESIIEVLRPDDGGL 

LQTETTLGLSSYQQKSISLYRGNCRPIRFEPPMLDFHBQPVQMP 

KMEKVYLHNPSSE*TITIiVSXFATTSHFHASFFQNRKILPGGNT 

SFDVS/VFLARWGNVENTLFINTSNHGVFTY\QVFX5VGVPNPy 

RliRPFLGARVTVNSSFSPIlNIHNPHSEPLQWEMYSSGGDUib 

ELPTGQQGGTRKLWEIPPYETKGVMRASFSSREADNHTAFIRIK 

TNASDSTEFIILPVEVEVTTAPGIYSSTEMLDFGTLRTQDLPKV 

LNLHLLNSGTKDVPITSVRPTPQ\NDAITVHFKPITIJCAS\ESK 

YTKVASISFDASKAKKPSQFSGKITVKAKEKSYSKliEIPYQAEV 

LDGYLGPDHAATLFHiRDSPADPVERPIYLTNTFSFAILIHDVt, 

LPEEAKTMFKVHNFSKPVliII^PNESGYIFTLLPMPSTSSMHrDN 

NILLITNASKFHLPVRVYTGPLDYFVLPPiaEERFIDPGVI^AT 

EASNILFAIINSNPIEIiAIKSWHIIGDGXLSIELVAVDRGNRTT 

IISSIiPECEKSSSSDQSSVTLASGYFVAVFRVKLTAKKLXEGIH 

DGAIQITTDYE1LTIPVK\AVIAVGSLTCSPKHVVLPPSFPGKI ^ V 

VHQSI.NIMNSFSQiCVKIQQIRSLSEDVRFYYKRLRGNKEDLEPG 

KKSKIANIYFDPGLQCGDHCYVGLPPLSKSEPKVQPGVAMQBDM 

WDADWDZ.HQSLPKGWTG I KEKSGHRLSAIFEVNTDLQKKI ISKI 

TAELSWPSILSSPRHLKFPLTNTNCSS \EEEITLENP /SODVPV 

YVQPIPIALYSNPSVFVDKLVSRFNI.SKV7UCIDIiRTriEFQVFRN 

SAHPLQSSTGFMEG\LSPHLIliNLir,KPGEKKSVKVK\FTPVHN 

RTVSSLI I VRmLTVMDAVMVQGQGTTENUlVAGKLPGPGSSLR 

FKITEALLKIXnrDSLKLREPWFTLKRTFKTOlTGQLQIHlETIE 

ISGYSCEGYGFKWNCQEFTLSAKASRD I IILFTPDFTASRVIR 

ELKFITTSGSEFVPrLNASLPYHMLATCAEALPRPNWEIALYII 

XSGIMSAI*FLLVIGTA\YLEAQGIWBP\FRRRLS\FEASNPPFD 

VGRPFDLRRIVGISSEGNLNTLSCDPGHSRGPCGAGGSSSRPSA 

GSHKQ*GPSGHPHSSHSNRNSADVDDVRAYNSGRTSSMTSAQAA 

SSQPAJTKTRPLVLDSNTGAQGHSAGRKSKGAKQSQHGSQHHAHS 

riiiiunJr'yt'fc'ijjr't'iJVPgPQEPQPERLSPAPIiAHPSHPERASSARH 

SSEDSDITSLIEAMDKDFDHHDSPALEVFTEQPPSPLPKSKGiCG 

KPLQRKVKPPKKQEBKEKKGKGKPQEDELKDSLADDOSSSTTTE 

TSNPDTEPLLKEDTEKQKGKQAMPEKKESEMSQVKQKSKKLLMI 

KKEIPTDVKPSSLELPYTPPI.ESKQRRNLPSKIPLPTAMTSGSK . 

SRNAOKTKGTSKLVDNRPPAIiAKFLPKSQEljGNTSSSBGEKDSP 

PPEWDSVPVHKPGSSTDSriYKLSLQTLNADXFLKQRQTSPTPAS 

PSPPAAPCPFVARGSYSSIVNSSSSSDPKIKQPNGSKHKLTKAA 

SLPGKNGNPTFAAVTAGYDKSPGGNGFAKVSSNKTGPSSSLGIS 

HAPVDSDGSDSSGLWSPVSNPSSPDPTPLNSFSAFGNSPMr»TGE 

VFSKLGLSRSCNQASQRSWNEFNSGPSYLWESPATDPSPSWPAS 

SGSPTHTATSVXX3NTSGLWSTTPFSSSIWSSNLSSALPFTTPAN 

TIASXGLMGTENSPAPHAPSTSSPADDI^TYNPWRIWSPTIGR 

RSSDPWSNSHFPHEK 




3X0 


1270 



tiVSLSGPVSLGVI,LCARSSTMGKRDNRVAYMMPIAMARSRGPIQ 
SSGPTIQX VX * XDQGLPGKK*KSN*KRKRK/DSKALAEFEEKMtf 
ESIWKKELEKHREKLLSGSESSSKKRQRKKKEKKKSW* \DSSSst 
3SSSDSSSSSSDSEDEDKKQGKRRKKKKNRSHKSSESSMSETBS 








DSKDSLRKKKKSKIX3TEKEKDIK0LyJM<KKMvSEDKPLSSESLS^ 

eseyieevrakkkksseerekatektkkkkkhkkhskkkkboEaa 

SSSPDSP*H*EKSGFPYKESAMSEEISTViaTTYLLKCMNFIiVF 
GI I PGIiFSSHSDATV 


5821 


179 


915 


kwrnqswrwpkpgtnmmlscsvcwrrvtwtgsvwnirki.gkhpqt 
pt/ikdcsiaatgkrpsarfphqrrkkrremddgiaeggpqrsn 

TYVIiaFDRSVDLAQFSENTPLYPICRAWMRNSPSVRERECSPS 

splpplpedeeg\sevtnsksr*cvqacppthtpqgqpknacr\ 

SRIPSPIAALRMQGTP*RWSPFEPEPSPSTLIYRNMQRWKRIRQ 
RWKEASHRNQLRYSESMKrLREMYERO 


5822 


464 


4379 


UTLKEMPIVMARPLESTASSSEDEEVISQEDHPCXMWTGaCRRI '• 

PVLVFHADATLTKDNNlRVIGEKyHLSYKIVRTDSRLVRSILTA 

HGFHEVHPSSTDYNLMWTGSHLKPPLLRTLSEAQKVNHFPRSYE 

LTRKDRLYKNIIRMQHTHGPKAFHILPQTFLLPAEYAEFCNSYS 

KDRGPWIVKPVASSRGRGVVYLINNPNQISIjEEMILVSRYINNP 

IiLIDDFKFDVRLYVLVTSYDPLVIYliYEEGrARFArVRYDQGAK 

NIRNQPMHIiTNYSVNKKSGDYVSCDDPSVEDYGKKWSMSAMLRY 

IjKQBGRDTTALMAHVEDLI I KTIISASXAIATACKTFVPHRSSC 

felygfdvlidstlkpwllevni.spslacdap!:jch,kikasmisd 

MFTWGPVCQDPAQRASTRPIYPTFESSRRNPPQKPQRCRPLSA 

sdaemkni^vgsarekopgiclggsvlglsmeeikvlrrvkeendr 
rggfirifptsetwexygsylehktsmmymlatrlfqdrmtadg 
APELKi * si^skaklhaalyerkij:,slevrkrrrrssri,ramrp 
kypvitqpaemnvxteteseeeeevau>nedeeqeasqebsagf 

LRENQAKYTPSLTALVEin'PKENSMKVREWNKKGGHCXrKLETQE 

lepkfnlmqilqdngnlskmqariafsaylqhvqi\rlmkdsgg 

QTFSASWAAKEDEQMELWRFLKRASNNLQHSIiRMVtiPSRRIAI* 

LERTRIZJVHQIiGDFIlVYNKETEQMAEKiCSKKKVEEEEEDGVNM 

ENFQEFIRQASEAELEEVLTFYTQKNKSASVFLGTHSKISKNNN ^ 

KYSDSGAKGDH:pETIMEEVKrKPPKQQQTTEIHSDKLSRFTTSA 

EKEAKLVYSNSSSGPTATLQKIPNTHLSSVTTSDLSPGPCHHSS 

LSOIPSAIPSMPHQPTILIiNTVSASASPCIiHPGAQNIPSPTGLP 

RCRSGSHTIGPFSSFQSAAHIYSQKLSRPSSAKAGSCYriNKHHS 

GIAKTQKEGEDASI.YSKRYNQSMVTAEI.QR1AEKQAARQYSPSS 

HINLItTQQVTNLNLATGIINRSSASAPPTLRPIISPSGPTWSTO 

SDPQAPENHSSSPGSRSUJTGGFAWEGEVENNVYSQATGWPQH 

KYHPTAGSYQLQFALQQI^EQQKLQSRQLLDQSRARHQAIFGSQT 

LPNSNLWTMNNGAGCRISSATASGQKPTTLPQKWPPPSSCASL 

VPKPPPNHEQVLRRATSQKASKGSSAEGQLNQIKJSSLNPAT^FVP 

ITSSTDPAHTKIMNHKHTEKQPVHHSWVHD 


5823


42 


2293 



LLTALSMEGGOGRDEPSACRAGDVmODPKKEDILLLADEKFDF 
BLSLSSSSANEDOBVFFGPFGHKERCIAASLELNNPVPEQPPLP 
TSESPFAWSPi:*AGEKFVEVyKEAHLU;l*HIESSSRNQAAQAAKP 
fiOPRSQGVERFIQESKP\ KINX*FEKEKEMKKS PTSLKRETYYLS 
DSPLLCPPVGEPRHASSPALPSSGAQARLTRAPGPPHSAHALP 
RESCTAHAASQAATQRKPGTiaLLPRAASVRGRGI PGAAEKPKK 
EIPASPSRTKIPAEKBSHRDVLP0KPAPGAVNVPAAGSHLGQGK 
RAIPVP\NKLGLKKTLLKAPGSYSN\LQRKSSSGA\VWSGASSA 

ctpqpvakakssefasipan*lpgu:pnisks\grmgpamlrpa 

I.\PflGPVG\ASSWQAKRVDVSEriAAEQI.TAPP\SASPTQPQTPE 

gggXqwlnsscawsessquwktrsirrrdso^sktkvmptptn 

QFKIPKFSIGDS\PDSSTPJCLSRAQRPQSCTflVGRVTVHSTPVR 

rssgpapqsllsawrvsalptpasrrcsglppmtpktmpravgs 

PL\CVPARRRSSEPRKNSAMRTEPTRESNRiCTDSR\LVDVSPDR 

3SPPSRVPQALNFSPEBSDSTFSKSTATEVAREEAKPGGDAAPS 

SAXjIiVDIKLEPLAVTPDAASQPLIDLPLIDFCDTPEAHVAVGSE 

3RPLIDLMTNTPDMNKNVAKPSPWGQLIDLSSPL.IQT.SPEADK 
SNVDSPLIiKF 


5824 


42 


2293


jLTALSMEGGGGRDEPSACRAGDVNi^DPiCKEDILLL^ 

DLSLSSSSANEDDEVFFGPPGHKERCIAASIiEUJNPVPEQPPlI 

CSESPFAWSPIAGEKFVEVYKEAHLIALHIESSSRNQAAQAAKP 




5825 






EDPRSQGVERFl6ESKb'VKINLl-'KkU'KJ.'MkktiLHi'yi var:"r>v^ 

DSPLLGPPVGEPRLIA3SPAI,PSSGAQARLTRAPGPPHSAHM.P 

RESCTAHAASQAATQRKPGTKLLLPRAASVRGRGIPGAAEKPlCK 

EIPASPSRTKIPAEKESHRDVLPDKPAPGAVNVPAAGSHIiGQGK 

RAIPVP\MKLGLKICTLLKAPGSYSN\LQRKSSSGA\VWSGASSA 

CTPQFVAKAKSSEFASrPAN*LPGIX:PNISKS\GRMGPAMLRPA 

L\PAGFVG\ASSWQAKRVDVSfiLAAEQLTAPP\SASPTQPQTPE 

(SGGNQWI^SSCAWSESSQCJJKTRSIRRRDSCrj^SKTKVMPTPTN 

OFKIPKFSIGDS\PDSSTPKLSRAQRPQSCTSVGRVTVHSTPVR 

RSSGPAPQSLLSAWRVSALPTPASRRCSGI.PPMTPKTMPRAVGS 

PL\CVPARRRSSEPRKNSAMRTEPTRESNRKTDSR\LVDVSPDR 

GSPPSRVPQALNFSPEESDSTFSKSTATEVAREEAKPGGDAAPS 

EALLVDIKI,EPLAVTPDAASQPLlDI,PriIDFCDTPEAHVAVGSE 

SRPLIDIiMTKTPDMNKNVAKPSPWGQLlDLSSPLIQLSPEADK 

ENVDSPLI.KF 




5826 


2 


4216 


t-LQXESASPAi^FSSGFLAAHPHSPGGSLATKGRSRLSAPGMUIli 
SAAPPAPPPEVTATARPCXCSVGRRGDGGKMAAAGALERSFVEI, 
SGAERBRPRHFREFTVCS IGTAtrAVAGAVKYSESAGGFYYVESG 
KLPSVTRNRFIHWKTSGDTLELMEESLDIEnbLNNAIRLKFQNCS 
VLPGaVYVSETQNRVirrJ4LTNQaimRI*LLPHPSRMyRSELWD 
SQMQSIFTDIGXVDFTDPCWYQLIPAVPGISPNSTASTAWLSSD 
GEALFALPCASGGIPVLKI>PPYDIPGMVSWELKQSSVMQRLLT 
GWMPTAIRGDQSPSDRPLSLAVHCVEHDAFIFAI^QDHKLRMWS 
YKEQMCLMVADMLEyvpVKKDLRI,TAGTGHiaRtAYSPTMGLYIi 
GIF\MHAPKRGQFCIFQI,VSTBSNRYSLDHISSLFTSQETLI0F 
ALTSTDIWAI>WHDAENQTVVKYINFEHNVAGQWNPVFMQPLPEE 
EIVIRDDQDPREMYLQSLPTPGQFTNEALCKALQIFCRGTERKI, 
DLSWSEIUKKEVTIAVENELQGSVTEYEFSQEEFRNLQQEFWCKF 
YACCliQyQEAI^HPIAIJ{LNPHTJnWCXiKKGYIiSFI.IPSSLVD 
HLYLLPYEKLIiTEDETTISDDVDIARDVICXIKCLRLIEESVTV 
DMSVIMEt^CYJJI^SPEKAAEQlLEDMITIDVENVMEDlCSKLQ 

EIRWPIHAXGLLrREMDYETEVEMEKGFNPAQPLNIRMNI.TQLY 
GSNTAGYTVCRGVHKIASTRFIiICRDI.riTT,nnT t MT>r r^T\-k^rTti^ 

TGOt,FQAQQDIiLfIRTAPLLLSYYr,IKWGSECXATDVPLDTLBSN 
LQHLSVLELTDSGALMANRFVSSPQTIVELFFQEVARKHIISHL 
FSQPKAPLSQTGIJrWPEMITAITSYLlK3LI,WPSNPGCXPLECLM 
GNCQYVQLQDYIQLlJiPWCQVimSSCRFMLGRCYLVTGEGQKAL 
ECFCQAASBVGKEEFbDRLXRSEDGElVSTPRLQYYDKVIJlLtJD 
yiGLPELVIQLATSAITEASDDW\KSQATI,\RTCIFKHHL\Dm 
\HN£QAYGSL*PQIPDSSRQLDCLRQLVVVLCERSQU30I«VEFS 
YVNLHMEWGI lESRARAVDLMTHNYYEUJYAPHIYRHNYRiCAG 
TVMFEYGMRU3RE\mTLRGbBKC5GNCyiAAIiKCIjRLIRPEYAWX 
VQPVSaAVYDRPGASPKRNHDGECTAAPTNRQIEXLELEDLEKB 
CSLARlRLTLAQHDPSAVAVAGSSSAEEMVTtLVQAGLFDTAXS 
tiCQTFKLPt.TPVFEGlAPKCriCLQFGGEAAQABAMAWLAANQX,S 
SVXTTKESSATDEAWRUQSTYLERYKVQNNtiYHHCVINKLLSHG 
VPr,PNWJbINSYKKVDAAELLRLyLNyDIJ^DX.TPyQVrRlCGC 




5827 


3 


871 


itbUijl4RDHSAPPPKPCTSVGAMGC*PRQ/sPKEQQRQt»KKQKNR" 

AAAQRSRQKHTDKADAIJlQQHESLEKDNLAriRKEIQSLQAEIAW 

WSRTIJIVHERLCPMDCASCSAPGLLGCWDQABGLIiGPGPQGQHG ' 

CREQLELFQTPGSCYPAQPLSPGPQPHDSPSIXQCPLPSLSLGP 

AWAEPPVQLSPSPLLFASHTGSSliQGSSSKLSALQPSLTAQTA 

PPQPLELEHPTRGKLGSSPDNPSSAIX5LARLQSREHKPALSAAT 

rfQGLWDPS PHPLIAFPIaLSSAQVHF 




19A 


2287


3MGSENSAL:<SYTIiREPPFTI.PSGtAVYPAVLQDGKFAS'VFVYK 
lEMEDKVKKAAKVP* *HLKTLRHPCLLRFLSCTVEADGIHLVTE 
WQPUEVAliETtiSSAEVCAGiyDIliLALIFI>HDRGHLTHNNVCL 

3svfvskdghwklggmetvckvsqatpeflrs iqs irdpasi pp 
:emspefttlpechghardafsfg'plveslltilneqvsadvls, 

!FQQTI.HSTI,LNPIPKWRPALCTU:*SHDFFRNDFI.EVVNPLKSB 
^LKSEEEKTEPFKFr^DRVSCLSEELXASRLVPLI.LNQLVFAEP
A



WO 01/53312



PCT/US00/34263



SEQ 
ID 
NO: 



5828 



5829 



5830



5831



5832



Predicted
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 



71 



Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 



■■■■i259- 



3139 



2897 



2454 



829 



Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)
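The legend above can be read as a small lookup table. The following is a minimal Python sketch, not part of the original disclosure, of how a segment written in this notation might be tallied. The function name `summarize_segment` and the example string are hypothetical and chosen only for illustration; the example string is not one of the listed sequences.

```python
# One-letter residue codes, as given in the table legend.
AMINO_ACIDS = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid", "E": "Glutamic Acid",
    "F": "Phenylalanine", "G": "Glycine", "H": "Histidine", "I": "Isoleucine",
    "K": "Lysine", "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine", "S": "Serine",
    "T": "Threonine", "V": "Valine", "W": "Tryptophan", "Y": "Tyrosine",
}

# Special symbols from the legend, mapped to the counter they increment.
SPECIAL = {
    "X": "unknown",      # X = Unknown
    "*": "stops",        # * = Stop Codon
    "/": "deletions",    # / = possible nucleotide deletion
    "\\": "insertions",  # \ = possible nucleotide insertion
}


def summarize_segment(segment: str) -> dict:
    """Tally residues and special symbols in a segment (hypothetical helper).

    Whitespace is ignored. Any character outside the legend is collected
    under "unrecognized" instead of being guessed at.
    """
    counts = {"residues": 0, "unknown": 0, "stops": 0,
              "deletions": 0, "insertions": 0}
    unrecognized = []
    for ch in "".join(segment.split()):
        if ch in AMINO_ACIDS:
            counts["residues"] += 1
        elif ch in SPECIAL:
            counts[SPECIAL[ch]] += 1
        else:
            unrecognized.append(ch)
    counts["unrecognized"] = "".join(unrecognized)
    return counts


if __name__ == "__main__":
    # Illustrative string only, not taken from the table.
    print(summarize_segment("MAV*GK/LX\\P"))
```

Characters outside the legend are reported rather than corrected, so a segment that contains them can be checked against the original listing.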
VAV\KSFLPYLLGPKKDHACK3BTPCIiL SPALF^SRVIPVlaLQrF 
EVHEEHVRMVLLSHIb'AYVGALSI>REQr,KKV\IL\PQVLLG\tR 
D\TSDSIVAITLHSIiAVLVSr.LGPEWVGGERTKIPKRTAP\SP 
TKXNTDLSLEGDPPSQPIKFPINGLSDVKNTSEDSENFPSSSiOC 
SEEWPDWSGPE\EPENQTVNI\QIWP\REP\CDDVKSQCTTLDV 
EESSWDDCBPSSLDTKVNPGGGITATKPVTSGEQKPIPArj:>SLT 
EESMPWKSSLPQKISIiVQRGDDADQIEPPKVSSQERPLKVPSEb 
GIiGEEFTIQVKKKPVKDPEIMDWFADMIPEIKPSAAFLILPELRT 
EMVPKKDDVSPVMQFSSKFAAAEITEGEAEGWEEEGELNWEDNN 

AREGGSLGAVAACXSEI^SYSCDFCPARPHTSWLTRFVKMEFQAW 
MAVGGGSRMTDIiTSSIPKPLLPVGNKFLIWYPLKLLERVGFEEV 
IVVTTRDVQKALCAEFKMKMKPDIVCIPDDADMGTADSIjRYlyp 
KLKTDVI.VLSCDI.I TD VAIiHEWDLFRAYDASLAMLMRKGQDS I 
EPVPGQKGKKKAVEQRDFIGVDSTGKRI*I,riyiANEADLDEEI*VIK 
GSILQKHPRXRFHTGLVDAHLYCLKKYIVDPLMENG\SITS1RS 
EL\I PYLV/RGKQFSSASSQQGTRKEKEGGSKOKRGLKSFRISY 
SFY*KEANYTGTGAPY\D\ACWI 



PDGRH VS CSE0KT I KI WDTTNKQCVNNFSDS VGFANFVDFNPS 
GTCIASAGSDQWKVWDVRVNKLLQHYQVHSGGVNCISFHPSGN 
yLITASSDGTLKILmiLKGRLIYTr^GHTGPVFTVSFSKGGEI.F 
ASGGADTQVLLWRTNFDELHCKGLTKRNLKRliHFDSPPHr^LDIY 
PRTPHPHEEKVErVEDFFLHLriRI,IQSIA*SICRSLLPLLWISP 
LLILPQQQKPWGLCQ^RVKRPVDIS*TLP*CHQNVCQQPRKRK 
QKT*VTSPVKVK/VSIPIAVTnALEHIMEQl»WVLTQTVSlLEQR 
I LTLtEDKLKDCLBWQQKLFSAVQQKS 



GGKMAAPBERDI^TQEQTEKLLQFQDLT GIESMDQa^rniEQHNW 
NIEAAVQDRLNEQEGVPSVFNPPPSRPLQVWTADHRIYSYWSR 
PQPRGLLGKGYYIilMLPFRFTYYTIIiDIFRFALRFIRPDPRSRV 
TDPVGDIVSFMHSPEEKYGRAHPVFyQGTYSQAU*DAKREr,RFI* 
LVYLHGDDHQEi^bEFCRNTLCAPEVrSLINTRMLFWACSTNKPE 
GYRVSQALRENT YPFLAMI MLKDRRE * P V\ VGRLBGLI \QPBDL 
INQLTFIMDAKQTYIiVSERLEREERNQTQVl,RQOQDEAYl*ASIiR 
ADQEKERKKRBERERKRRKKEEVQQQKLAEERRRQNLQEEKBRK 
liECIiP PEPSPDDPES VKI I FKLPNDSRVERRPHPSQSIiTVIHDF 
IiFSLKESP\EKFQlEA\NFPRR\VLPCIPSEE\WPNPPTIK?E\A 
I GLSHTEVIjFVQbLTDE 



FCSKDKCCLYLPDSIMRSKSCrAKP GAHSQDRHAVMDSBRQVKD 
TDDIESPKRSIRDSGYIBCWDSERSDSLSPPRHGRDDSFDSLDS 
FGSRSRQTPSPD\AnJ2GSSDGRGSDSESDLPHRKr,PDVKKDDMS 
ARRTSHGEPKSAVPPNQYLPNKSNQTAYVPAPLRKKKAEREEYR 
KSWSTATSPAGLGKKALQDYGPRT\PVS\DDAESTSMFDMRC3E 
EAAVQPHSRARQEQLQLINNQLREEDDKWQDDIARWKSRKRSVS 
QDI^IKKEEERKKMEKLLAGEDGTSERRKSIKTYREIVQEKBRRE 
REIiHEAYKNARSQEEAEGILQQYIERFTISEAVLERIiEMPKrLE 
RSHSTEPWt^SSFIiNDPKPMKYIaRaQSLPPPKFTATVETTIARAS 
VLDTSMSAGSGSPSKTVTPKAVPMLTPKPYSQPKNSQDVLKTPK 
VDGKVSVNGETVHREEEKERBCPTVAPAHSLTKSQMFBGVARVH 
GSPLELKQDNGSIEINIKKPNSVPQELAATTEKTEPNSQEDKND 
GGKSRKGNIELASSEPQHFTTTVTRCSPTVAFVEFPSSPQLKND 
VSEEKDQKKPENEMSGKVELVLSQKWKPKSPEPEATLTFPFIjD 
KMPEANQLHLPNLNSQVDSPSSEKSPVTTPFKFWAWDPEEERRR 

qekwqqeqbrllqbryq\keqdk\lkee\wekaqkevbeebrry 

YEEEP*II\EDPWPFTVSSSSADQLSTSSSMTEGSGTMNiaDL 

gncqdekqdrrwkkspqgddsdlllktrbsdrlbekgsltegal 
ahsgnpvskgvhedhqiidteagaphcgtmpqraqpps^qqtsh 

PTHSSEDVKPKTLPLDKSINHQIESPSERRKSlSGKKLCSSCGIj 
PLGKGAAMIIETLNI*yFHIQCFRCG\lCKGQIX5DAVSGTDVRIR 
NGLLWCNDCYMRSRSAGQPTTL 

PGRRFRHGSCAFQKQCIMI/HICQYFLQGECKFGTSCKRSHDFSil " 
SENLEKr.EKLGMSSDLVSRLPTIYRMAHDIKWKSSAPSRVP Pl.F








VPQGTSERKDSSGSVSPNTLSQEEGDQICLYHIRKSCSFQDKCH 
RVHFHIiPyKWQb'bDHGKWEDIiDNMELIEEAYCWPKlERILCS&S 
ASTFHSHCLNFNAMTYGATQARRLSTASSVTKPPHFILTTDWIW 
YWSDEFGSV7QEYGRQGTVHPVTTVSSSDVEKAyLAY/WYTGV*R 
PGSHLEVPGRKAQLRVRPQSLRSEKPGLWHN*KGLPQTQXR\AP 
QDVTTMQTCWTKFPGPKSIPDYWDSSAIiPDPGFQKITLSSSSEE 
YQKVWNliFNRTLPFYFVQKIERVOKLALWEVYQWQKGQMQKQNG 
GKAVDERQLPHGTSAIFVDAICQQNFDWRVCGVHGTSYGKGSYF 
ARDAAYSHHYSKSDTQTHTMFLARVLVGEFVRGNASFVRPPAKE 
GWSKAPYDSCVWSVSDPS I FVIFEKHQVYPEYVIOYTTSSKPSV 
TPSILLALGSLFSSRQ 


5833 


170 


3289 


SILCLLSPC WQFGKPWS I liSSRSRHSPCTKKGWEGMRKHLHT 

RQGHK* VHVEIS KALWVYRDDYFIRHS IS VSAVI VRAWITHKYR 

GRDWNVKWEElTXiLHAVAKNYTLLQTI PPFERPFKDHQVCLEWNM 

GY 1 WNLRANRI PQCPLEND WALLGFPYTVSSGENTGIVKKFPRF 

RNREI>EATRRQRMDYPVFTVSLWLYI.rjHYCKANLCGILyFVDSN 

EMYGTPSVFLTEEGYLHIQMHLVKGEDIAVKTKFIIPLKEWFRL 

DISFWGGQIWTTSIGQDLKSYHNQTISFREDFHYNDTAGYFII 

GGSR YVAG IEGFFGPLKYYRU^SLHPAjQIFNPLLEKQLAEQI KL 

Y YERCAEVQEI VS WASAAKHGGERQEACHLHNSYUDLQRRYGR 

PSMCRAFPWEKELKDKHPS LFOALliEMDLiIiTVPftMnMPCiVc vrr* 

GKIFEKAVKRLSS IDGIiHQISS IVPFIiTDSSCCGYHKASYYLAV 

FYETGLWVPRDQLQGMLYSLVGGQGSERLSSMMIiGYKHYQGIDN 

YPX,DWELSYAyYSWIATKTPI.tX3HTLQGDQAYVETIRT,KDDEri* 

KVQTKEDGDVFMWLKHEATRGNAAACK3RIJy3MLFWGQQGVAKWP 

EAAIEWYAKGALETEDPALIYDYAIVLFKGQGVKKNRRLALELM 

KKAASKGLEQAVNGLGWYYHKFKKNYA\KAAKYWLKA\EE\KGN 

PDASYNLGVI*HLDGIFPGVPGRNQTLAGEYFHKAAQGGHMEGTIj 

WCSI^YYITGNIiETFPRDPEKAWWAKHVAEKNGYLGHVIRKGLM ^ 

AYI»EGSWHEALLYYVl*A7^ErGIEVSQTNLAHICEERPDI*ARRYL 

gvncvwryynfsvfqidapsfaylkmgdlyyyghqnosqdle:*s 
vqmyaqaaldgdsqgffniiaiiliebgtiiphhildfleldstiji 

SNNIS ILQELYERCWSHSNEESFS PCSIiAWLYLHLRLLWGAILH 
SALIYFLGTFLLSILIAWTVQYFQSVSASDPPPRPSQASPDTAT 
STASPAVTPAADASDQDQPTVTNNPEPRG 


"" 5834 


17 


4020 

1 ] 


RFRRGGGRVFPGAFPASPSDSIjGQGNSQGPPRTPKPPRT/'QECG" 

SAAPGP I PGQS33 * VPLRI,EQIQQKADCPLSr.EIxALKPRMAAQV 

TLBDALSNVDLI^EELPI^PDQQPCIEPPPSSLLYQPNFNTKFEDR 

NAFVTG lARYI EQATVHSSMNEMLEEGQEYAVMLYTWRSCSRAI 

PQVKCWKQPrmVEI yEKTVEVLEPBVTKU4NPMYFQRNAI ERFC 

OEVRRLCHAERRKDFVS EAYLITLGKF INMFAVMELKNMKCS V 

KNDHSAYKRAAQFI.RKMADPQSIQBSQNLSMPIiAlJHNKIXQSLQ 

QQLBVISGYEELlADrVNlfCVDYYENRMYLTFSEKHMLL 

GLYLMIXSSVSNIYKLDAKK^XNI^KIDKYFKQI^VVPriFGDMQI 

BIARYIKTSAHYEENKSRWTCTSSGSSPQYNICEQMIQIRBDHM 

RFISEIARySNSEVVTGSGRQEA0KTDAEyRKI;FDLALQGLOI,L 

SQWSAHVMEVYSWKLVHPTDKYSNKDCPDSAEEYERATRYWYTS 

ccn.i'AijvavijflMi K.i»j^vij^GRMESVFNHArRHTVYAATfQDFSQ 

VTUVIEPLRQAIKKKKNVIQSVI^AlRKWaJWETGHEPFimP^ 

RGEKDPKSG*D I KVPRRAVGPSSTQLYT4VRIMLESLIADKSGSK 

KTLRSSLEGPTILDIEKPHRESPFYTHLINPSETLOQCCDLSQI, 

WFREFFLELTMGRRIQFPIEMSMPWILTDHILETKEASMMEYVL 

ySlJ)LYWDSAHYAI*TRFNKQFLYDEIEAEVNLCFDQFVYKIiADQ 

IFAYYKVMAGSIJ.U3KRLRSECKNQGATIHLPPSNRyETLLKQR 

HVQLLGRSrDLNRLITQRVSAAMYKSr,EIiAIGRPESEDLTSIVE 

LDGLLBIKfRMTHKIiXiSRYLTLDGFDAMFREANHNVSAPYGRITL 

HVFWEUsTYDFLPNYCYNGSTMRFVRTVIjPFSQEFORDKQPKAQP 

QYLHGSKALNLAYSSIYGSYRNFVGPPHFQVICRI,LGYQGIAW 

MEBLLKWKBLUXSTILQYVXTtiMEVMPKICRLPRHEYGSPGIl, 

EFFHHQLKDIVEYAELKTVCFQNXiREVGNAILFCLLIEQSLSLE^ 

EVCDLLHAAPFOKIIjPRVHVKEGERI*DAKMKRI*ESKYAPLHLVP








HERLGTPQQIAIAREGDLLTKERLCCGI^MFEVILTRlRSFiD 
DPI WRGPLPSNGVMHVDECVEPHRLWSAMQFVYCI PVOTHEF^TV 
EQCFGDGLHWAGCMI IVLLGQQRRPAVLDFCYHLLKVQKKDGKD 
EIIKNVPLKKMVERIRKFQI LNDE I ITILDKYLKSGDGEGTPVE 
HVRCFQPP IHQSLASS 


5835


4209 


1904 


SGNI RMAQGSHQI DFQVI.HDLRQKFPEVPEVWSRCMLQKNNKL ' 

DACCAVLSQESTRYLYGEGDliNFSDDSGISGLRNHMTSI/KLDLQ 

SQNIYHHGREGSRMNGSRTLTHSISDGQLQGGQSNSELFQQEPQ 

TAPAQVPQGFNVFG.viSS S SGASNSAPHLGFHIX3SKGTSSLSQQT 

PRFNPIMVTLAPNIQTGRNTPTSLHIUGVPPPVLNSPQGKSIYI 

RPYITTPGGTTRQTQQHSGWVSQFNPMNPQQVYQPSQPGPWTTC 

PASKPLSHTSSQQPNQQGHQTSHVYMPISSPTTSOPPTIHSSGS 

SQSSAHSQYNIQNISXGPRKNQIEIKLEPPQRNNSSKIiRSSGPR 

TSSTSSSVNSQTIjNRNQPTVYIAASPPNTDELMSRSQPKVYISA 

NAATGDEQVMRNQPTLFISTNSGASAASRNMSGQVSMGPAPIHH 

HPPKSRAIGNNSAT3PRVVVTQPNT\EYTFKITVSPNKPPAVSP 

GWSPTFELTNLIaKHPDHYVETENIHHI.TDPTLAHVDRlSETRK 

liSMGSDDAAYTQDI *RISNS WIiGMVAHACNSSALGGQDGRII*A 

QEFETSWGNIWRLRI,YRRP*NYAGMVAHTCSPSYSVD*AtjLVHQ 

KARMERIiQRELBIQKKKIiDKLKSEVNEMENMLTRRRIjKRSNSIS 

QIPSLEEMQQLRSCNRQbQIDlDCIiTKEXDIiFQARGPHFNPSAI 

HNFYDNIGFVGPVPPKPKDQRSIIKTPKTQDTEDDEGAQWNCTA 

CTFIiNHPALIRCEQCEMPRHP 


5836


361 


2303 


FHITMCGICCSVNFSAEHFSQDLKEDLLYNLKQRGPNSSKQI.LK 
SDVWYQCLFSAHVLHJORGVLTTQPVEDERGNVPLWNGEIFSGIK 
VSAEENDTQILFNYLSSCKNESBIIiSLFSEVQGPWSFIYYQASS 
HYI1WFGRDFFGRRSZ.LWHFSNLGKSFCLSSVGTQTSGLANQWQE 
VPAS\DFSELILSLXjSFPDAIiFYNCILGNIFLGRILLKKMIiIA* 
v:<FWrYQHIiYQR*OMKPNCILKNIiIiFli* r*CCHKLH¥?RLIAVI 
FPMCHLaERyF?CSFLU«rYT*KEVlQQFIDVl,SVAVKKRVLCLPR 
DENLTANEVUCTCDRKANVAILFSGGIDSMVIATLADRHIPLDE 
PXDIjLWVAPIAEBKTMPTTFWRSGNKQKWKCEIPSEEFSKDVAA 
AAADS PNKHVSVPDR 1 TGRAGLKELQAVSPSRI WNFVEINVSME 
EIXJKLRRTRICHZjIRPIiDTVLDDSIGCAVWFASRGIGtVLVAQEG 
VKSYQSNAKWLTGIGADEQIiAGYSRHRVRFQSHGLEGIjNKEIM 
MEUSRISSRNI/SRDDRVIGDHGKEARFPFIiDEtfWSFIiKStjPIW 
EKANLTItPRGIGEKLLtiRLAAVSLGLTASAIiriPKRAMQFGSRIA 
KMEKINEKASDKCGRLQIMSIiENLSIBKETKL 


5837


4792 


903 


NGNAVAQAPVTNCCYLATGSKDQTIRIWSCSRGRGVMILKLPPL 

KRRGGGIDPTVKERLWLTLHWPSNQPTQIiVSSCFGGELLQWDLT 

QSWRRKYTIiFSASSEGQKHSRIVFNDGPLQTEDDKQr»I*LSTSMD 

RDVKCWDIATLECSWTLPSLGGFAYSIAFSSVDIGSIiAIGVGDG 

MIRVWNTLS IKNNYDVKWFWQGVKSKVTALCWHPTKEGCIAFGT 

DDGKVGriYDTYSNKPPQISSTYHKKTVYTIAWGPPVPPMSLGGE 

GDRPSLALYSCGGEGrVLQHNPWKLSGEAPDIJJKLIRD'mSIKY 

KLPVHTEI SWKADGKIMALGNEDGSIEI FQ\ IPNLKI.2 CTIQQH 

Hitt VNTIS WHIIE \HGS PAQKLS YL\MPSGSQQC2 PFTCHWLKKC 

P* KAAPBS PSDPLQSP YRTPPQGHTAQDYPVWAWEPHIH* WEGL 

VFCFPIDGYSPGCWD\AFPGKEAPVAIFRG\HQGRLLCVAWSPI. 

DPDCXYSG\ADDFCVHKWLTSMQDHSRPPQGKKSIBLEKKRLSQ 

PKAKPKKKKKPTLRTPVKLESIDGNEEESMKENSGPVENGVSDQ 

EGEEQAREPELPCGliAPAVSREPVICTPVSSGFEKSKVTINNKV 

ILLKKEPPKEfCPETLIKKRKARSLLPLSTSLDHRSKEELHQDCL 

VliATAKHSRELNEDVSADVEERFHLGLFTDRATLYRMir>XSGKG 

HLENGHPELPHQIiMLWKGDLKGVLQTAAERGELTDNLVAMAPAA 

GYHVWLWAVEAFAKQIiCPQDQYVKAASHLLSIHKVYEAVEIiliKS 

NHFYREAIAIAKARLRPSDPVLKDLYLSWGTVLERDGHYAVAAK 

CYLGATCAYPAAKVIxAKKGDAASbRTAAELAAIVGEDEIiSASLA 

LRCAQELLIJi^r^rwVGAQ2ALQI^ESLQGQRLVFCLLEIaiSR^ - 

EKQLSEGKSSSSYHTWNTGTEGPFVERVTAVWKSIFSLDTPEQ? 

QEAFQKLQNI KYPSATNKFTPAKQLLtiHI CHBLTIAVLSQQMZVS W








DEAVQArXRAVVRSYDSGSPTIMQEVVSAFLPIX3aDHLRDKIX» 
HQSPATPAFKSLEAFFLYGRLYEFWWSLSRPCPNSSVWVRAGHR 
TLSVEPSQQLDTASTEETDPETSQPEPNRPSELDIiRLTEEGERM 
LSTPKELFSEKHASLQNSQRTVAEVQBTLAEMrRQHQKSQLCKS 
TANGPDKNEPEVEAEQPLCSSQSQCKEEKMEPLSLPELTKRDTE 
ANQRMAKFPESIKAWPFPDVLECCliVIiLLIRSHFPGCLAQEMQQ 
QAQEl4LQKyGNTKTyRRHCQTF€a4 


5838 


110 


98 


ISLDLESSCVTKiCLSPEKEIYEMES\PSGRIWGNVSTITFQYNG 
I/SDNMECKGNLEGQVSKSEGLYMCVKXTCBEfCATESHSTSSTFH 
RII/HYQGKIVKCKECRQGFSYLSCLIQHEENKNI*KCSEVNKH 
RNTFSKKPSyi*HQ\KFRXX3EKPYECMECGKAFGRTSDLIQHQK 
IHTWEKPYQCNACGKAFIRGSQLTEHQRVHTGEKPYDCKKOGKA 
FSYCSQYTLHQRIHSGEKPYECKDCGKAFILGSQLTYHQRIHSG 
EKPYECKECGKAPlLGSHLTyHQRVHTGEKPyiCKBCGKAFLCA 
SQLNEHQRIHTGEKPYECKECXSKTFPRGSQLTYHLRVHSGERPY 
KCKECGKAPISNSNLIQHQRIHTGEKPYKCKEOSKAFICGKQLS 
EHQRIHTGEKPFECKECGKAFlRVAYIiTQHEKlHGEKHYECKEC 
GKTFVRATQLTYHQRIHTGEKPYKCKECDKAF/HLWLTILSEHQ 
RIHRGEKPYECKQCGR/LFIRGSHL/NEHLRTHTGEKPYECKEC 
GRAFSRGSEHTLHQRIHTGBKPYTCVQCGKDPRCPSQLTQHTRli 
HN* E YSSHKI CMHS lALASLDFAHLQEKNPBK 


5839


1 


2425 


GRPFPRPPRALPRLPLRGRRQDGRWTVDPEECLKD\SPRFRAAL 

EEVEGDVAEIiELKL\DKI*VKLCIA\MIDTGKAFCVANKQP«NGI 

RP\IAQNS\NNDA\WETKFAPSFI.DSLQEMINFHTII./I'*PNS 

EIN*GHSFQNFVKBDLRKFKDAKKQFENSQ*KRKKIAI,VKNAPV 

I'OKifA&uc.lj J!ti't'nXJjTATRKCFRHIALDYVLQINVbQSKRRSE 

lUCSMItSPWYAHIAFFHQGYDLFSBIiGPYMKDLGAQIiDRLVGDA j 

AKEKREMEQKHSTlQQKDFSRDDSICLKyNVDAAKrGlVMEGyLFK 

RASKAFKITOrRRWFSIQNNQVVYQKKPKDMPT^A^^DI*RI^TVK 

HCEDIERRFCFEWSPTKSCMLQADSEKLRQAW1KAVQTSI\AT 

AYRBKDDESEKLDKKSSPSTGSLDSGNESKEKLLKGESAtASRVQ 

CI PGNASCCDCGLADPRWAS INr/3ITLCIECSG XHRSLGVHPSK 

VRSLTljyrWEPELLFO^CELGNDVINRVyEANVEKMGIKKPQPG 

QRQEKEAyiRAKYVERKFVDKIFL*SLSPP\EQQKK\FVSKSSE 

ekrlsiskfgpXgdqvrasaqssvrsndsgiqqssddgrbslps 

TVSAWSLYEPEGKRQDSSMFLDSKHLNPGLQLYRASYEKNLPKM 
AEALAHGADVNVJAWSEENKATPLIQAVI/SaSLVTCEET^LQMGAN 
VNQRDVQGRGPLHHATVLGHTGOVCLPtiKRGANQHATDEBGKDP 
LSIAVEAANADIVTLLRLARMNEEMRESEGLYGQPGDETYQDIF 
RDFSQMASNNPEKLNRPQQDSQKF 


5840


698 


3610 


KHLHLPRQHl.TTLWQISSPRWRSPQRA?MSALSKTQTQSAPAIiQ 
GLSSIiliQSVTGNPVPASEAASQSTSASPANTTVYTIKGRKLPSS 
AQPFIPKSFNYSPNSSTSEVSSTSASICISIGQSPGLPSTAPKLP 
SNTKGFTATHKTSPAAPPTEVTICQSSEVSKPKL\ESESTSPSL 
\3MKIHNPLKGNPGFSVA*NLKHPNPAGSLGSSAPSESHPSDFQ 
RGPTSTSIDNIDGTPVRDERSGTPTQDBMMDKPTSSSVDTMSLL 
SKIISPQSSTPSSTRSPPPGRDESYPRELSNSVSTYRPreiiGSE 
SPYKQPSDGMERPSSLKDSSQEKFYP0TSPQEDEPYRDPEYSGP 
PPSAMMNLQKKPAKSILKSSKIiSDTTEYQPIliSSYSHRAQEFGV 
KSAFPPSVRALLDSSENCDRIiSSSPGLFGAPSVRGNEPGSDRSP 
SPSKNDSFFTPDSNHNSI*SQSTTGHIiSLPQKQYPDSPHPVPttRS 
LFSPQMTtJUVPTGHPPTSGVEKVLASTISTTSTlEPKNMLKNAS 
RKPSDDKHFGQAPSKGTPSDGVSLSNLTQPSLTATDQQQQBEHY 
RIETRVSSSCU>LPDSTEEKGAPIETLGYHSASNRRMSGEPIQT 
VESIRVPGKGNRGHGREASRVGWFDLSTSGSSFDNGPSSASEIA 
SLGGGGSGGtiTGFKTAPYKERAPQFQESVGSFRSNSFNSTFEHH 
LPPSPLEHGTPFQREPVGPSSAPPVPPKDHGGIFSRDAPTHI.PS 
VDLSWPFTKEAALAHAAPPPPPGEHSGIPFPTPPPPPPPGEHSa 
SGGSGVPFSTPPPPPPPVDHSGWPPPAPPLAEHGVAGAVAVF? 
ECD!ISSr.U3GTLAEttFGVLPGPRDHGGPTQRDUJGPGLSRVRESL








TIiPSHSLEHL.GPPHGGGGGGGSNSSSGPPLGPSHRDTlSRSaiI 
LRSPRPDFRPREPFLSRDPFHSLKRPRPPFARGPPFPAPKRPFP 


5841 


1908 


762 


GLRI^FLVLTVWPMMKPSWLSRTEPSKRLLCRTLWCQSGWSSRSY 
TRSMLKMTTSINRRSRTSTKSTRTSARPGLTATVSIGLSDSPTW 
RHCWMTARSCSGEKGGHWAPRQVGVYLLPGRVGCVSSRVSPSFP 
GDGIiDSGr*ARRGSAVSA]uASGI,VESPMIiGPPFHPTPRFKAVSAK 
SKBDLVSQGFTEinriEDFHNTFMDI»IEQVEKQTSVADLLASFND 
QSTSDYI.WYLRLLTSGYLQRESKFFEHFIEGGRTVKEFCQ\QE 
\VEPMCKE$DHIHIIALAQGLQRVHPGWEYMGPRPKAATTNPHI 
FP*GLPSPICVYLLYRPG\HYDILYKIGLGSSPLGCPGCPLLARA 
LGHCYRG FS WVKWS YFTPFFLSHDP PPMF Y 


5842 


307 


1918 


QEPTADFKLRSTCX3CGREMTCPDKPGQliINWFIt5LCVPRVRKL 
WSSRRPRTRRNlJIiLGTACAIYLGFLVSQVGRASLQHGQAAEKGP 
HRSRDTAEPSFPEIPLDGTLAPPESQGNGSTLQPNVVYITLRSK 
RSKPANIRGTVKPKRRKKHAVASAAPGQEALVGPSIiQPQEA\EG 
KLML^HLGTLREQTWLRLESDPGGWCGVRE/WRAGGPDFLQPSS 
RESNIRIYSBSAPSKIjSKDDIRRMRLIiADSAVAGIiRPVSSRSGA 
RLLVLEGGAPGAVLRCGPSPCGLI^KQPLDMSEVFAFHLDRILGL 
NRTIiPSVSRKAEFIQDGRPCPIILWBASI^SASNDTHSSVKLTW 
GTyQQLIiKQKa^QNGRVPKPESGCTEIHHHEWSKMALPDFLLQI 
YNRI^DTNCCGFRPRKEDACVQNGLRPKCDDQGSAALAHI IQRKH 
DPRHLVPIDNKGPFDRSEDNLNPKLLEGIKEFPASAVyvUCSQH 
LRQKLLQSLFJLDKGYMESQGGRQGIEKLIDVIEHRAKILITYIN 
AHGVKVI,PMNE 


5843 


500


1453 


GTARLVTCWVX,HGQ*VICKPAWEPGVVWL*Q*RaiPKGWGLGAGM 
R3SRMSQPPQCl,RRAQSSCCHFMVKI.LDDGTPMIPGEKVAHTSL 
DALVTFHQQKPIEPRRBLLTOPCRQKDPANVDYEDUFUYSKAVA 
EEAACPVSAPEBASPKPVLCHQSKERKPSAEM/RQNNHQGSHPL 
liPPKIPSWRDPPETLEEPQNAPRERPEGPAAAKKPPRHCBLWT 
LGCPEIHGDLRPWDRKRQPRSLRGSHLGGQRLHGSLCGHISQKP 
LTAPGTKRQKGPHQEGREVGQLH*GDPRGQELAPNGSESPII,PG 
VQARAPGLGRA 


5844 


202 


2471


FDSAVLSS INVMAVLPGPLQItLGVliLTISXiSS IRLIQAGAYVg I " 

KPLPPQIPPQMPPQIPQYQPLGQQVPHMPLAKDGLAMGKEMPHL 

QYGKEYPHIUPQYMKSIQPAPRMGKEAVPKKGKEI PLASLRGEQG 

PRGEPGPRGPPGPPGLPGHGIPGIKGKPGPQGYPGVGKPGMPGM 

PGKPGAMGMPGAKGEIGQKGEXGPMGIP*PQGPPGPH6LPGIGK 

PGGPGLPGQPGPKGDRGPKGLPGPQGLRGPKGDKOPGMPGAPGV 

KGPPGMHGPPGPVGLPGVGKPGVTGFPGP\QGPLGK\PGAPGEP 

GPQGPIGVPGVQGPPGIPG IGKPGQDG\ IPGQPGPPGGKGEQGI* 

PGLPGPPGLPGIGKPGFPGPKGDRGMGGVPGALGPRGEKGPIGA 

PGIGGPPGEPGLPGIPGPMGPPGAIGFPGPKGEGGIVGPQGPPG 

PKGEPGLQGFPGKPGFLGEVGPPGMRGPPGPIGPKGEHGQKGVP 

GLPGVPGLLGPKGEPGIPGDQGLQGPPGIPGIGGPSGPIGPPGl 

PGPKGEPGLPGPPGFPGIGKPGVAGLHGPPGKPGALGPQOQPGL 

PGP PGPPGP PGP PAVMPPTPP POGKVT.P DMrt r J- rrvWD oci a vr» 

AKKGKNGGPAYEMPAFTAELTAPFPPVGAPVKFNKIiliYNGRQNy 
NPQTGIFTCE\nPGVYYFAYHVHCKGGNVWVALFKNNEPVMrrYD 
EyKKGFLDQASGSAVLLLRPGDRVFLQMPSEQAAGLYAGQYVHS 
SFSGYLLYPM 


5845 


215 


2061 


EiASNKSASI^DKMANPKEKa7^CL\^IARPmVQPQYKLLNER 
GPAHSKMFSVQI,SLGEQTWESEGSSXKKAQQAVGMKALTESTLP 
KPr*KPPKSNVNNNPGCITPTVEIiNGi:AMKRG\KPAIHRPLDPK 
PFPNNRANYNFQVMYNQRYHCP I PKX FYVQLTVGNNEFFGEGKT 
RQAARHNAAMKAU3AtQNEPIPERSPQNGESGKDMDDDKDANKS 
E I SliVFEl ALKRtlMPVSFEVIKESGPPHMKS FVTRVSVGEPSAE 
GEGNSKKI»SKKRAATTVLQELKKLPPLPVVEKPK\HFFKKRPKT 
IVKAGPEYGQGMNPISRIAQIQQAKKEKEPDYVLLSERGMPRR% 
EFVMQVKVGNEVATGTGPinCKXAICKNAAEAMIJ:.Ql^YKMTNl4 
DQLBKTGENKGWSGPKPGFPEPTNNTPKGILHLSPDVYQEMEAS


5846


1126 




KGTSSTAEAlGLKGSSPTPPCSPVQPSKQLEYUVHIQGPQVinrc 
DRQSGKECVTCXTLAPVQMTFHArGSSlEASHDQV*yATSlLLC 
YGPARKWKAIKMEAMCAHAALI^I^IHYLIAPSTU^BKSKbAllG 


5847 




456 


PSKLIKKTt'x XGXSGVTNSGKTTLAKNLQRH1,PJJCSVISQDDFF ^ 
KPESBIETDKNGFLQYDVLEAI^EKMMSAISCWMESA^ 
TDQr.SAEEIPILIIEGFLLFKYKPLDTIWNRSYFLTIPY3ECKR 
RRSTRVYQPPDSPGYFDGHVWPMYLKYRQEMQDITWEWYUXJT 
KSEEDLFLQVYEDLIQELAKQKCLQVTA*RimTTNPS /CK*rRK 


5848


2769 
22 


505


APEMEDLSSPD5TLLQGGHNLLSSASFQESVTFKUVIVDFTOEE - 

PWIMEPSlPVGTCADWETRLENSVSAPEPDlSEEELSPEVrVEK 
KKRDDSWSSNLLESWEYEGSLERQQANCXJTLPKEIKVn'EKTIPS 
WEKGPVNNEFGKSVNVSSNLVTQEPSPEETSTKRSIKQNSMPVK 
KEKSCKCWECGKAFSYCSALIRHORTHTGEKPyKCN*/cVEKAP 
SRSENLINHQRIHTGDKPYKCDQCGKGFIEGPSLTOHORIHTGE 

cnecgkafwgpstfirhhmihtgekpySSg^SIS^S 

GKAFSYCSSLTQHRRIHTREKPFECSECGKAFSYLSNLNQHQICr 
HTQEKAYECKECGKAFIRSSStAKHERIHTGEKPYQCHE^KT; 
SYGSSLXQHRKIHTGERPyKCNECGRAFNQNIHLTQHKRIHTGA 
KPYECASCGKAFRHCSSLAQHQKTHTEEKPYQCKKCBKTFSQSS 
HLTQHQRIHTGEKPYKCNECDKAPSRSTHLTQHQRIHTGEKPYK 


5849


3545


2961 

] 


AAPRRLIiRGGtKJi>KTPHFPLPAt;LRPGPPAEAAPKRRKMPAVSK " ' 

GIXSMRGIAVFlSDIRNaCSKEAElKRlKKEI^IRSKFKGD^^ 

I)GYSKKKYVCKI^FrET,LGHDIDFt3HMEAVNLLSSNRyTEK0lG 

YLFISVLVNSNSELIRLINNAIKMDLASRNPTFMGIAr^HCIASV 

GSREMAEAPAGEIPKVLVAGDTMDSVKQSAALCLLRLYRTSPDI. 

VPMGDWTSRVVHLUTOQHLGVVTAATSLITTLAQKNPEEFICTSV 

SLAVSRI.S\RIVTSASTDLQDYTY*FCPGPLC3LSVKLr,RLLQCy 

PPPDPAVRGRliTECLETILNKAQBpPKSiCKVQHSKAKKAVliEA 

SSEFSHEAVKTHIETVINALKTERDVSVRQRAVDI.LYAMC33RSN 
APQIVAEMLSYLETADYSIREEIVLKVAIIAEKYAVDyTWXyVD 
TltiNLIRIAGDYVSEEVWYRVIQIVINRnDVQGYAAKTVPEAIXJ 
APACHENLVKVGGyiLGEFGNLrAGDPRSSPI.IQFHLLHSKFHL 

CQRAVEYLRLSTVASTDXIATVLESMPPFPERBSSILAKLKKiCK 
GPSTVTDLEDTKRDRSVDVKGGPEPAPASTSAVSTPSPSADLLG 
LGAAPPAPAGPPPSSGGSGLLVnVFSDSASWAPIAPGSEONPA 

NFTPTLlCSDDLQPNLNfr^JTKPVDPTVEGGAQVQQWKrBCVSD 
^^^^S!!^^'^''''^^^^Q^^^T^NKPFQPTEMASQDFFQ 
^n^S^^^^S?^Q^^^^^"°T2VTKAKIIGFGSALLE3VDP 






lass 1 I 

I 
1 

c 

\ 
I 

C 
L 


UlREIKETVFHHVAQAGLELLSSSIjPPSSASRSAGITGMRHQVQ 
^*DPCMSLSPPCFTEEDRPSLEALQTIHKQ«DDDKDGGXEVEES 
)BFrRBDMKYKDATNKHSHLHREDiCHITIEDI,WKRWKTSEVHNW 
^LEDTU3WLIEPVELPQYEKNFRDNNVKGTTLPRrAVHEPSFMI 
QLKISDRSHRQKLQLKALDWLFGPIiTRPPHNWMKDFILWSI 
aGVGGCWFAYTQNKTSKEHVAKMMKDI^LQTAEQSl^LQER 
BKAQEEtniWAVEKQNL*RKMMDEINYAKEEACRLREl,REGAE 
'EI^RRQyAEQELEOVRMALKKABKEFELRSSWSVPDALQKWuf 
THEVEVQYYWIKRONAEMQLAIAKDEAEKIKKKRSTVFGTUIV

AHSSSLDEVDHKILEAKKALSELTTCUiERijyKWOQIEkiraFQ 
IAHMSGLPSLTSSLYSDHSWVVWPRVSIPPV?IAGGVDDLDBt>T 
PPrVSOFPGTMAKPPGSLARSSSLCRSRRSIVPSSPQPQRAQLA 
PHAPHPSHPRHPHHPQHTPHSLPSPDPDILSVSSCPAIiYRNEEE 
EEAiyFSAEKQMBVPDTASECDSI^SSIGRKQSPP/SKPRDIPN 
IIS/DERYQEMRCP*RIPSGGIIi 



5851 3120" 



5852 



422 



5853 



58SS 



223 



938



536 



2391 



KAVWJFSASysviSLTGSNPMHDASMWHLKKMGIIVYIiDVPLLN" 

LICRLKLMKTDRlVGQWSGTSMKDULiKFRRQYYKKWYDARVFCE 

SGASPEEVADKVLNAIKRYQDVDSETPISTRHVWPEDCEQKVSA 

EFPIEAVIEGIASDGGLFVPAKEFPKIiSCGEWKSLVGATYVERA 

QILLERCIHPADIPAARLGEMIETAYGENFACSKIAPVRHLSGN 

QFILELFHGPTGSFKDLSLQLMPHIFAQCIPPSCNYMILVATSG 

DTGSAVLNGFSRLNKNDKQRIAWAFPPENGVSDFQKAQIIGSQ 

RENGWAVGVESDFDFCQrAlKRIFWDSDFTGFIiTVEYGTILSSA 

KSINWGRLIiPQWYHASAYLDLVSQGFISFGSPVDVCIPTGNFG 

KILAAVYAKMMGrPIRKFICASKQEJHVWTDPlKTGXHYDLRGKE 

K*AQTFFTVQ* I FLPNLSNLERHI^KLMANKDGQLMTKLPNRLES 

QHHFQIEKALVEKLQQDFVADWCSEGECIAAINSTYNTSGYILD 

PHTAVAXWADRVQDKTCPVrrSSTAHYSKPAPAIMQALKIKEI 

NETSSSQLYLl^SYNALPPLHEALLERTKQOEKMBYQVCAADMN 



RCYLQFLAIii.L.rSTSARAAAAIAAAEEP AGSPSVMTRAGDHNRQ 
RGCCGSLADYLTSAKFIiliYLGHSLSTWGDRMWHFAVSVPliVELY 
GNSLUUTAVYGliWAGSVLVLGAI IGDWVDKNARLKVAQTSLW 
QNVSVXIiCGIILMMVPLHKHELLTMYHGWVLTSCYILIlTIANI 
ANIASTATAITIQRDWIVWAGEDRSKLANMNATIRRIDQLTNI 
LAPMAVGQIMTFGSPVIGCGFISGWNLVSMCVEYVlaLWKVYQKT 
PALAVKAGLKEEETEIiKQLNLHKDTEPKPLEGTHLriGVKDSNIH 
ELEHEQEPTCT^QMABPFRTFRDGWVSYYNQPVF /LGWHGSCFP 
LYDCPGL*LHHHRVRLHSGTEWFHPQYFDGSISYNWNKGNCSFY 
tATSKMWFGSDRSDLRIGTAFXiFDLVCPLCIHAWKPPGLVRFSF 



KTTFFiiHWJPLRQLPEVRGYSGQPLTDPLISLCRSHKCRGKGWG 
SSSYPSLPALLRARSAPGHCTHRSCGPEWRIDSISRI^EMQGAHR 
SGWAQAQPTILLLVPRLRKSLPSIWG/SLMaPFITSGPG/WFRQ 
YYPFISGRH*VI*PTBSDPyYVAMDFGGHGL3SHYSPGVPyYLQT 
FVSBIRRWAGKKQSVYFRRCGGCSRAPPIiITGGGVGSRKQRWP 
ESGAWAIAPGLiPAIHGRSWES 



RLLGIiSRVKGLHGPAASAWISDPETRGDPGGPWGMWRGfiDLkPR 
PVSr,TGLTriVCK*AAQGPQV\HSVKLCFGIiGG\PCLL\FPIPRP 
LLLEPRRPRLHPGTRGVAVEPHALRWHVAHGEEAGIRAAGPGH 
GGVEIPQG/VGSLGARRGLRPSRPSSRHRNRVPAPPPGRPLATP 
HRRRFPPDPALTCPGLGQDQGPRECXJKOGSGRHDTIIiGDWGESE 
SRWVRGNFRTGTAATI,IGFSRNPTIiNGSENWGSI.VSIQEEGPDr 
GWERBKRNPAEMGNPQRWASPIHTPPLGPEiriRAMPEALRAMPE 
ALGLRPDPArSVPSALS/QTF/PESWPRSCLRNQGETIO^SPVP 
LSSLCITESPSQMWTPCLLI.LTCPRGLF 



KGRKTAPEKKGAALNNREM A£<J*MGY/SRWKODXRRIENHIXQE 
LKHLCyiMIKRVLLERLBimiKLRELTEGRTLDWPQNRITEVSAK 
RQIVTEYREKGKRN*EEKKRDLEGRSRRYNLCIIGIPETEDRAS 
GABTXKDr.DE/EWFPEX,KNELDLQMEKAHRrPLKFNEKKAASEH 
IRVTFL/KFQRRWILQASSQRKQVTYKGAKVRLTSDFSPAILNA 
RRQW/N/PISRVXRENNFEPRZ rYSAKLSFLYKGNWKTFLDIQG 
j4GKYINQELSLKtLLKDLLQI.TENUf 



LRSYGCKAPSRISHLHKXFLPIjLLPSUUyiGYSESPPPITDSWAP 
PISLTHHVLSQSQSPLSSNCWICLSTHTQ*FTALPADLLTWTQS 
NVSWIISYlAIPFLAi>SPIiKPV/L*PGNSAKHLSFKLSSLSMVS 
GRAVALLHLTASGLTSIOnjJTASSKPPIWGYXr^STQTSFlSPPP 
LCliSRTYPNPAHATMVGQVPQS LCGLIFTL/RTPCRPS XLHPNY 
KI ISTSAWQKVI.CPSGSPTXHTSLHLTTGSSFLSFHPIPGFPAA 
NSALYVSSLKGPPGK^^r^XPSPVTGT*QPPHRGSN/RLTVDIaiK


5856 






ir-lr-I.KPKPMSLHUbPSO\TPYQAbTGAAlAGSYPlWEWEm^^ 
PTPrYNFCbSTPSLFFLCDTN*YLCLPANMqr"rr^T i7Crt?v«m-r« 

ILPPNQTILrsVEASISSSPlRNKWALHLITLLTGLai^GT 

GIAGITTSlTSYQTLFTTbSNTVEDMHTSITSLQRQLDFLVGVI 

I^NWRVt.DLLTTEKGGTCIYLQEECCFCVNESGIVHIAVRRLHn 

RAAEL*HQVADSWWQGSSLLRWIPWVAPFLGPLIFLFLI,LMiGP 

CIFNLVSRFISQRLNCFIQASMQKHIDNIFHLCHV*YQSLRGNH 
SEAPEPRP 


5857 


X73 


1137 


PWLHGLGLSAVFLFYb*/YVn'FHLYGGIILLLLIFISIAGILYk - 

rwuvijLiXFl'is.QPSSSRLYVPMPTarPHENIFlRTKDGlRLNI.TL 

IRYTGDNSPYSPTIIYFHGNAGNIGHRLPNALLMLVNLKVNLLL 

VDYRGYGKSEGEASEEGLYLDSEAVLDYVMTSPDLDKTKIYLSG 

RSIiGXGAAAIHIiASDNSHRISAIMVENTFLSIPHMASTL^SFFP 

mryi.plwcyknkflsyrkisqcrmpslfisglsdqlippWkq 

LYELSPSRTKRLAIFPDGTHNDTWQCQGYFTALEQFIKEWKSH 


5858


1597 


563 


KlilGKVLVI^ V k/AUAMAAFAVBPQU.PAbOSKi'MMbGSPTSPKPG 
VNAQFr,PGFI,MGDLPAPVTPQPRSISGPSVGVMEMRSPI.IAGGS 
s- r s^tr V V iTi^tuuJ .'UsLaAi' P VRS I YDD I SS PGIiGSTPLTSRROPNIS 

VMQSPLVGVTSTPaTGQSMFSPASlGQPRKTTI.SPAQUOPFYTQ 
GDSX,TSEDH\LDDSWGDCIMGFLKASA\SYlI.b\QFAQYGGIS* 
NMWMSNTGNWMHIRYQSiaQARKALSKDGRIFGESIKIGVKPCI 
DKSVMESSDRCAbSSPSIAPTPPIKTLGTPTQPGSTPRISTMRP 
LATAYKASTSDYQVISDRQTPKKDESLVSmjEYMFGW 


5859 


355 


1419 


PPHQPAAASTSXHQQQQPPPPPQDSSKPWAQGPGPAPGVeSAP 

PASSSAPPATPPTSGAPPGSGPGPTPTPPPAVTSAPPGAPPPTP 

PSSGVPTTPPOAGGPPPPPAAVPGPGPGPKQGPGPGGPKGGKMP 

G3PKPGGGPGLSTP3GHPKPPHRGGGEPRGGRQHHPPYHQQHHQ . 
GPPPGGPGGRSEEKISGPRRGFKANT^«iT r i7DT>r'T»vn*im»rtft^r,r,« 

LLGIYLLISRRMNSRRLFAKIWENQEKFJCSTKAKDSEFIKLESR 

ALA*NCPKFELG*YTP*GGRQLPSSLFPTHACLPLSCSVIPSPP 

MFPQ*NCWGRKPFRPNLGPHLKGAVCNRWDDPWEGPlt3KGHCLM 
PAS 


5860


307 


1503 


G3SSARPRAijSRRMLSRKKTKNEVSKPAEVQGKyVKKfe.TSPLLR 
NLMPSFIRHGPTlPRRTDlCiiPDSSPNAFSTSGDGWSRNQSFL 
RTPXQRTPHEIHRRSSNRI.SAPSYIARSIADVPREYGSSQSPVT 
EVSFAVENGDSGSRYYYSDNPF£X3QRKRPr/3DRAHEDYRYYEYN 
HDLFQRMPQWQGRHASGIGRVAATSLGWLTNHGSEDI.PLPPGWS 
VDWTMRGRKYYIDHNTNTTHWSHPLEREGJUPPGWERVESSEPGT 
YYVDHTKKBCAQy\RHPCAPTCTSV*sr:Srai/AS/RQQTERWQ 
SLLVPANPYHTAEIPDWLQVYARAPVKYDHILKWELFQLADLDT 
YQQ^^^KLLE^^LEQI\^CMY1EyVYRQAt»LTEDENRKQRCMWYA(V5 


5861 


2956 


1270

] 
< 
I 
I 


TIRVEEFPLCPGGGKAQI^SASI,U3AGLl^PPTPPPI.LLtLFP 
Lr.LFSRLCGALAGPI rVEPHVTAVWGKNVSLKOUIEVNETITQI 
SWEKIHGKSSQTVAVHHPQyGFSVQGEYQGRVLFKMYSi;^A'^I 
A iMu.^ e i>i^i3i»K.x X C jKAVTFPLGNAQSSTrVTVLVEPTVSLIKG 
PDSLIDGGMETVAAXCIAATGKPVAHIDWEQDXiGEMESTTTSFP 
NrETATIISOYKLFPTRFARGRRITCWKHPALEKDIRrSPlLDI 
aYAPEVSVTGYDGJ^FVGRKGVNLECCNADAWPPPFKSVWSRLDG 
3WPDGLItASDE^TLHFVHPLTFNYSGVYICKVT\NSPGSKBVTOK 
imPTFQDPSLPTYPPLPALQPQWASPSTA*TSRD\IArEP*KlA 
PSPLSTr.\ATIKGWTQLPTIIA*CSGVGALPIV\LVKCR3LGIp 
inrRRRRTFRGDYFAKNYIPPSDMQKESQIDVLQQDEIiDPYPDSV 
CKENKNPvmblRKDYLEEPEKTQWNNVENLNRF^RPMDYYEDL 
CMGMKFVSDEHYDENEDDI/VSHVDGSVISRREWYV 




2051 


1305 I 
c 

1 


-VCACVQAFWLVASSGDDSaiGDKCGCEVGSWVGSMRVVMARI.L 

;e6eqgi ptacaafaqqpag/eprrglagvgeggpqcswvnyrc 
'leflvsllgtdlargrgnsasgptapadskql/mi,*0vhrrvik 
.e* rmmsgs pardnapsqrpctnlsbglrpgis pswrealygc?


5862


1556


483 


PPFQLIMGiSlKVSPDYNWFRGTVPLKKIIVDDDDSKIW'siT^AG"' 

PRSIRCPLIFLPPVSGTADVFFRQlLALTGWGYRVIALQYPVyw 

DHLEFCDGFRKLLDHLQLDKVHLFGASLGGFLAQKFAEYTHKSP 

RVHSLILCKSFSDTSIFWQTWTANSFWLMPAFMIJacrVLGNFSS 

GPVDPMMADAIDFMVDRLESLGQSELASRLTLNCQNSYVEPHKI 

RDI PVTIMDVFDQSALSTEAKEEMYKLYPNARRAHLKTGGNPPY 

LCRSAEVNLYVQIHL/R/RNSMEPNTRPI,THQWSVPRSLRCRKA 

ArASARRSSSVSLAVWDBLTRCVLV*SVASAPVSRPFPSGSSGS 
PVLTVSGK 


5863 


2714 




PFPSRGSI*PLAAPREDTMGPLMVIiFCL.IiFLyPGIADSAPSCPQN 

VNISGGTFTLSHGWAPGSLLrySCPQGLYPSPASRLCKSSGQWQ 

TPGATRSLSKAVCKPVRCPAPVSFENGrYTPRLGSYPVGGNVSF 

ECBtX5Fr\LRGSPVRQCRPNGMWDGETAVCDNGAGHCPNPGISIi 

GP\VRTGFRFGHGDKVRYRCSSNLVLTGSSERECQGNGVWSGTE 

PICRQPYSYDFPBDVAPALGTSFSHMLGATNPTQKTKESI.GRKI 

QXQRSGHLNLYLLLDCSQSVSENDFLIFKESASLMVDRIFSFHI 

NVSVAIITFASEPKVLMSVIiNDNSRDMTBVISSIiENTUJJYKDHSN 

GTGTNTYAALNSVYLMMNNQMRLLGMETMAW\QEIRHAIILI.\T 

DGKXSHMGGSPKTAVDHIREILNINQKRNDYLDIYAIGVGKIiDV 

DWRELNSLGSKKDGERHAFILQDTKAl4HQVFEHMliDVSia:.TDTI 

CGVGNMSANASD0ERTPWHVTIKPKSQET\C\RGALISDQWV1,T 

AAHCFRDGNDHSLWRVNVGDPKSQWGICEFLIEKAVISPGFDVFA 

ICKNQGIIi\EFyGD\DIALL\KUVQKVKM\STHC0OPSa[,P\CTM 

\EANLGFLRETFKGSTCR\DHENEI./V\WKQSV\PAHF\VAIj\N 

GSKI*EHLTriRMGVEWTSCCRGJ:iSPKKKTM\FPNIjT\DVRE\VVT 

D\QFL\CS\GPQEDESPNCK*E\SGGA\VKt;EKHKiiIiSAGGVWC 

SHGL\YNP\CT/3SA\DKNSPKKGPSVAKVPPPTR/DPHIN\LFP 

Q*SPMLRQHPGGMS*IFLPLIANGHI.SPFACPARICRPLHFI*PS ^ 
EWATLRTL 


5864


173 


1013 


PLISVPQSLISLPQPLLCFPGGQEPSAPSPCLYSPLWACSFTMG ' ' 

KLPPSIPPSSPLACVLKNiKPLQLTPDLKPKCLIFFCNTAWPQY 

KLDNDSK* PENGTFBFS ILQVLDNSCHKMGKWSEVPDVQAFF\ S 

HWSltPSLCSQC/GLlPNLSS FS PFCSFG/ PPPQVPS P /TES FFS 

MOSSDLPPSPQAAPRQAEPGPNSHLASAPPPYNPFITSPPHTWS 

SI^PHSVTSPPPPAQQFTLKKVAGAKGZVKVSAPFSLSQIR*RL 

GSFSSNIKIQPSSWLIWQQP 




5865


568 


1684 


CLPGPRWGEGWRAGHTIVGCIFFKXAI-SHFKGGMYLCVCMCTC 
IiSVCVCVQVGSWlCV/CVSMCACVSLCTCMCRCISMYTREHAC 
ACmV*VlfMCMS/VCTCVSTCIDVRVCAHVCVYMCI.CI<GYA*AC 
TCV*MCVCMHEKVCMC/VCACSCVLL/CRGHICM/MCMSAYICI 
/ CVY VCVLCVWACMRMSTCVWLVYG*ACTCVWMHM/GSCTCR/C 
VHVCCMSMHACBCLCVYLHiCGCAGTRRWWASSARGSRSCSRIiP 
CWAPGPGLSLPGPSCPSVEQGLGGGPGQLQGRSGEARLGEHRGW 
GSPAAVCSRNCTVSPRRGADCF2APDVPKQPPGWGRASFEBRGC 
GGRGWVCAPPIJfGPQCCCFSIKPELKAKKKK 




5866 


98 


3197 


ARPBVPAPPAWLSRRGAAKMGDKKDDKDSPKKNKGKERRDIjDBiT" 
KKEVAMTEHKMSVEEVCRKYWTDCVQGLTHSKAQEILARDGPKA 
LTPPPTTPEWVKPCRQLFGGPSILLMIGAILCFLAYGIQAGTED 
DPSGDNLYLGIVliAAWl ITGCFSYYQEAKSSKIMESPKNMVPQ 
QALVlREGEKMQVNAEEWVGDIiVEIKGGDRVPADIiRIISAHGC 

kvdnssi.tgesepqtrspdcthe\nplktrnitffsnnfveqta 

RGWVATGDRTVMGRlATLASGIiEVGKTPIAIEIEHFIQLITGV* 
AVFIiGVSFFILStilLGYTWLSAVIFrilGIIVAWVPEGLtATVTV 
CliTI^TAKRMARKNCbVKNLEAVETLGS'fSTlCSDKTGTLTQNRM 
TVAHMMFDNQIHEADTTEDQSGTSFDKSSHTWVALF*H/LLGFC 
WRPVFKGGQDNIPVLKRDVAGDASESALLKCIELSSGSVKLMRE 
RNKKVAE I PPNSTNKYQLSIHETEDPNONRYLLVMKGAPERILD 
RCSTrLLQGKEQPIiDEEMKEAFQKAYLEIiGGLGERVLGPCHYYL ' 
PEEQpPKGFAFDCDDVNFTTDNliCFVGLMSMlGPPRAAVPDAvf 
ECCRSAGIKVIMVTGDHPIXAKAIAKGVGI IFBGNETVEDIAARIi 
SEQ
ID
NO:


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


5867 






iU^v^Uvwt^KUAKACVlHGTDLKDFTSKQIDEILQNHTEIVPAR 
TSPQQKLIlVEGCQRQGAIVAVTGDGVNDSPALKKADIGVAMGr 
AGSDVSKQAADMlLLDDNFASrVTGVEEGRLlFDMLKKSIAYTl, 
TSNIPEITPPLLFIMANIPLPLGTITILCIDLGTDMVPAISIAY 
EAAESDIMKRQPRNPRTDKLVNBRIiISMAYGOIGMIQALGGFFS 

YFVILAENGFLPGNLVGrRLNWDDRTVNDLEZ)SyGQQWTYEQRK 
VV^IFTXjHTAFFVSIVWOWADLIICKTEL&Hciuprw^MV'M tw 

GLFEETALAAFLSYCPGMDVAIiRMYPLKPSWWFCAFPYSFLIFV 
YDEIRKLILRRNPGGWVEKETYY 




3 


1485


lpgrrarggrglgwppaqaldosrmgkakvpaskrapss^vakp 

GPVKTLTRKKNKKKKRFWKSKAREVSKKPASOPGAWRP^KAPE 
DPSQNWKALQEWLLKQKSQAPEKPIiVISQMGSKKKPKIIQQNKK 
BTSPQVKGEEMPAGKDQEASRGSVPSGSKMDRRAPVPRTKASGT 

WFDDVDPADIEAAIGPEAAKIARKQLGQSEGSVSLSLVKEQAPG 

GLTRAIAUJCEMVGVGPXGEESMAARVSIVNQYGKCVYDKYVKP 

TEPVTDYRTAVSGIRPENLKQGEELEWQKEVAEMLKGRILVGH 

ALHNDLKVLFLDHPKKKIRDTQKYKPFKSQVKSGRPSIiRLLSEK 

ILGLQVQOAEflCSIQDAOAAMRLYVMVKKEWESMARDRRPLLTA 

PDHCSDDA*QSCPAAAAAPIaQRQCDQSQGQXTSPQSGNSGETFS 
ESWQRGVAWCY 


5868 
5869 


2122 


833 


i.TAGASHTQOAaQSTSAKVPAAAQNL/CVTNAMREDLAD±WYlR 
AVTVYDKPASFFKETPLDLQHRLPMKLGSMHSPFRARSEPEDPV 
TERSAFTERDAGSGLVTRLRERPALLVSSTSWTEDBDPS lUjAA 
I;ESRV*T\MTU:>GHNI.PSLVCVIT6KGPLREYYSHLIHQKHFQH 
IQVCTPWLEAEDYPIJ^SADLGVCLHTSSSGLDLPMKVVDMFXS 
CCLPVCAVNFKCLHELVKHEENGLVFEDSEEIAAQLQMLFSNFP 
DPAGKLNQPRKNIiRESQQliRWDESWVQTVLPLVMDT 


5870 


2122 


833 


IjTAGASHTQDASQSTSAKYPAAAQML/CVTNAMREDIJ^DIWYIR " ^ 
AVTVYDKPASFFKETPLDLQHRLFMKLGSMHSPFRARJSEPEDPV 

tersapterdagsglvtrlrerpallvsstswtededfsillaa 

I*ESRV*T\MTLDGHNLPSLVCVITGKGPLRByYSRLIHQKHFQH 
IQVCTPWLEAEDYPLLI^SADrXSVCLHTSSSOLDLPMKVVDMP'G 
CCLPVCAVNFKCIJtlBLVKHEENGIiVFEDSEEIAAQriQMLFSNFP 

dpagklnqfrknlresqqlrwdeswvqtvlplvmdt 


5871 


2122 


833 


LTAGASHrQDASQSTSAKYPAAAQNL/CVTWAlvUlEDiAblWYtR " 

avtvydkpasffketpldlqhrlfmklgsmhspfrarsepedpv 

TERSAFTERDAaSGLVTRLRERPAI,I.VSSTSWTEDEDPSILLAA 
I.ESRV*T\MTLDGHNI,PSLVCVITGKGPl,REYySRI,IHQKHPQH 
IQVCTPWLEAEDYPI,LLGSADLaVCI,HTSSSaLDLPMKVVDMFG 
CCLPVCAVNFKCLHELVKHEENGI>VFEDSEEtAAQLQMI,FSNFP 
DPAGKLNQFRKNLRESQQ]:.RWDESWVQTVLPI,VMDT 




3 


3465 

] 
I 
1 
J 
I 
I 
f 
C 


fffcrplrlyskttgdrsamagaagltajsvswkvlerrArtkrs 

VLKLL*I,SLRRL*LEPTI*NGLLT*CSRLSVFRFl,KV\GSVyEP 
LKSINLPRPDNETLWDKLDHYYRIVKSTLLLYQSPTTGI,FPT1CT 
aSGDQKAKIQDSLYOVAGAWAlJUAYRRIDDDKGRTHELEHSAI 
KCMRGILYCYMRQADKVQQPKQDPRPTTCLHSVFNVHTGDELLS 
xiii-xvjrt±jyi.i»AV5DYJjIjYLVEMISSGI*QIIYNTDBVSFIQNLVF 
CV\ERVYRVP\DFG\VWGKREGKYY*/SGSTELHSSSVG]UGKRQ 
L^KQFNGFNLFGNQGCSWSVIFVDLDAmiRNRQTLCSLLPRESR 
SHNTClAALLPCISYPAFALDDEVLFSQTLDiCVVRKLKGKYGFKR 

flrdgyrtsledpnrcyykpaeiklfdgiecefpipplymmidg 

/FRGNPKQVQEYQDLLTPVLHHTTEGYPVVPKYYYVPADPVEYE 
<iINPGSQKRFPSNCGRIX3KLFLWGQALYIIAKLLADELXSPKDI 
5PVQRYVPLKDQRWVSMRPSNQGPriENDLWHV7VLlAESQRLQV 
^UJTYGIQTQTPOQT^PIQIWPQQELVKAYLQLGINEKLGLSGR 
>DRPIGCLGTSKIYRILGKTWCYPIIFDLSDFYMSQDVFr*LID 
)IKNALQFIKQYWKMHGRPLFLVLIREDNIRGSRFNPILDMIAA 
.KKGriGGVKVHVDRIiQTLISGAWEQLDFLRISDTEELPEFKa 
^EEr£PPKHSlCVKRQSSTPSAPEIA2QPDVNISEWKDKPTHElJ 
>KLNr>CSCIiASQAII.LGIU:,KREGPNFITKEGTVSDHIERVYRR 








AGSQK1.WS VVIUIAASLLS KWDSiiAPS ITNfVLVQGKQVTI^AFG^ 

KEEEVISNPl.SPRVIQririYYKCMTHDEREAVrQQELVIHIGt*I 

ISNNPELFSGTLIQRIGWIIHAMEYELQIRGGDKPALDLYQLSP 

SEVKQLLIiDILQPCQNGRCWLNRRQIDGSLNRTPTGFYDRVWQI 

LERTPNGIIVAGKHLPQQPTLSDMTMYEMNPSLLVEDTLGNIDQ 

PQYRQrVVEI,IiMWSIVLERNPEIiEFQDKVDLDRLVKEAFNEFQ 

KDQSRLICEIEKQDO^^^SFYN^PPIiGKRGTCSYIiTKAVMNI.I:J:.EG 

EVKPNNDDPCJUIS 


5872 


68 


665 


VQGYMYRFVI KINSCYSEKTS I CRHRCCPELPATQPWPTPTVFF 
NIAID3 ESLGCI \SFKLFADKV/PKRWKKNFVLt*NTGEKVLGDK 
GPCPYRI IPG \LCQGGDFTHHNGTGGKSLYSKEFDDENFI /itKK 
TAPGVLSTANAGPTTNaSQPFICTAKTEDG*QHVVFGKVKDGMS 
IVEALERSGSRNGFCTSKKITAANCX3QL 


5873 


2240 


506 


rrppeggsgggrrtrarmplpwslalplllswvaggfgnaa^Ar 
hhgliasarqpgvchygtkiaccygwrrnskgvceatcepgckp 
gecvgpnkcrcppgytgktcsqdvnecgmkprpcqhrcvnthgs 
ykcfclsghmlmpdatcvnsrtcamincqyscedteegpqclicp 
ssgr.rlapngrdcxididecasgkvxcpynrrcvntfgsyyckch 
igfelqyisgrydcidinectmdshtcshhancpmtqgsfkckc 

KQGYKGKGLRCSAIPEa^SVKEVLRAPGTlKDRIKKLIAHKNSMK 
KKAK1KNVTPEPTRTPTPKVNLQPFNYEEIVSRGGNSHGG\KKG 
NEEKMKEGLEDEKREEKALKD*HRRERPFRG\DVFFPKVNEAGE 
FGLIIi\VQRKAIiTSKIjEHKADI^ISVDCSFNHG\ICDW\KQDR\ 
EDDFDW\NPADR\DNAl\GPY\MAVP6I*WQGHK\3CDIGRliKLLI* 
PDLQPQSNFCLLFDYRIAGDKVGKLRVFVKNSNNALAWEKTTSE 
DEKKKTGKIQLYQGTDATKS 1 1 FEAERGKGKTGEIAVDGVLI,VS 
GLCPDSLLSVDD 


5874 


2 


3387 


ACPRIiARRRRRVRSUiRRRGWLRARWSRGQNKMAARRITQETFD 
AVLQEKAKRYHMDASGEAVSETLQFKAQDtiLRAVPRSRAEMYDD 
VHSDGRYSLSGSyAHSRDAGRESLRSD VFSGPSFRSSNPS I SDD 
SYFRKECGRDIjEFSHSNSRDQVIGHRKLGHFRSQDWKFALRGSW 

eqdfghpvsqesswsqeysfgpsavjugdfgssrliekecxekex 
srdydvdhsg\ea\dsvlrgs\sqvqa\rgralnivdqegsi,lg 
. kgetqglltakggvgkijwlrwvstkkiptvnritpktqqtnqr 
qkktpspdvtlgtnpgtediqfpxqkiplgudlknlrlprrkms 

FDIIDKSDVFSRFGIEIIKWAGFHTIKDDIKFSQI.FQTLFE1*ET 
ETCAKMIJ^FKCSXiKPEHRDFCFFTIKFIiKHSALKTPHVDNEFL 
NMLIiDKGAVKTKWCFFEIlKPFDKYIMRIiQDRLLKSVTPLLMAC 
NAYELSVKMKTLSNPLOLAIjaiErTNSLCRKSIALLGOTFSlAS 
S FRQEKIL* AVGLQDIAPS PAAFPKFEDSTLFGREYIDHLKAWL 
VSSGCPLQVKKAEPEPMRESEKMIPPTKPEIQAKAPSSLSOAVP 
QRADHR WGTIDQLVKRVI EGSLS PKERTLLKEDPAYWPLSDEN 
SLEYKYYKLKI^EMQRMSENIJlGADQKPTSADCAVRW'iliYSRAV 
RNLKKKLLP\WQRRGI*LRAQG\LRG\WKARRA\TTGTQTLi:iFLR 
APGLKHHGRQAPGLS\QAKPSLPDRND\AAKD\CPU3PV\aPSP 
QDPSLEASGPS PKPAGVDISEAPQTSS PCPSADIDMKDNGRTAE 
zujmcc v^sjvKs\kfe.xetKlv\i»i \ENSTDNPDIjWFL\HDQNrSS\APK 
FY\RKKVPELCPSICFTSSPHin4\HTGGGDTT\GSQESPVDLME 
GEAEFEDEPPPREAELESPEVMPEEEDEDDEDGGEEAPA\PGRG 
GPSLEGSTPADGLPGEA\AEDDL/AIiGAPALFTGLLQVTCFPFG 
RGFSSKSL,KVGMIPAPKRVCt,IQEPKVHEPVRIAYDRPRGRPKS 
KKKKPKDI,DFAQQKI,\TDK\NLGFQ\MLOKMGWKBGHGLGSLGK 
GIR\SRSACTQQAAWGGSGWGI.SPSTCSLPU3SPTAKMAYSWQL 
IFVF 


5875 


296 


1846 


lAALGGIiPLWRLSRRGFREYLLGLSAPSALGGAMRSVSYVQRVA 
LBFSGSLFPHAICMbVDNDTIiNELWGDTSGKVSVyKNDDSRP 
WLTCSCQGMLTCVGVGDVCNKGKNLLVAVSAEGWFHLFDLTPAK 
VIiDASGHHETIiIGEEQRPVFKQH I PANTKVMLI SDIDGDGCREL 
WGYTDRVVR7VFRWEELGEGPBHLTGQr*VSI*KKWMLEGQVDSLS 
VTLGPLGIjPELMVSQPGCAYAILLCTWKKDTGSPPASEGPTDdl 
/SGDPSCPRRGAAPDIWPYPQQECLHSPNWQHQT\SHGTESSGS 








GLFALCTLDGTbKLMEEWEEADKLLWSVQVDHQLFALEKrjDVTG"" 
NGHEEWACAWDGQTYX X DHNRTWRFQVDENlRAFCAGLyACK 
BGRNSPCLVYVTPNQKI YVYWEVQLERMESTMLVKLLETKPVST 

WTCLrAGEGFF*TPTLPPKGVFGSHC3^GSITKQ 


5876


1122 


224 


ilLPLGVPS KVAGAAAME PQEERETQVAAWLKKIR3DHPI ^QYE V 
KPRTTEICHHLSERNRVRDRDVYLVIEDLKQKASEYESEAKYLQ 
CLLMESVNFSPAKLSSTGSRYLNALVDSAVALETKDTSLASFIP 
AVNDLTSDLPRTKSKSEEIKIELEKLEKNLTATLVLEKCLQEDV 
KKAE1.HLSTER\AKVDNRRQNM\DPLKAKSEEPRFGIQAAGBQL 
SARGQ\DAFSVPIQSLVALIRENWPRLKQQTIPLK\KKIiESyLD 
LMP\HPSHCSK*RIEEAK\REIA\SIEAELTRRVS\MMBIi 


5877 


2030 


1007 


GTLGKMAASSSGEKEKERLGGGIXSVAGGNSTREKLIiSAIiEDLEV 
LSRELIEMLAISRNQKIiLQAGBENQVIjELLIHRDGEFQELMKLA 
LNQGKIEWEMQVLEKEVEKRDSDIQQLQKQLKEAEQILATAVYQ 
AKEKLKSIEKARKGAISSEEIIKYAHRISASKAVCAPLTMVPGD 
PRRPYPTDLEMRSGLLGQMNMPSTNGVNGHLPGDAIA/RRKIAR 
CPCSTVS/NGSQMTCR+ INI ILILQKSVCEI, 


5878 


950 


2113 


GLWKCMQI^GPHTHRVQP*PTPHQQGPQ\VPVAVIAGNRPNYLY 
RMLRSLLSAQGVSPQMITVFIDGYYEEPMDWALFGLRGIOHTP 
IS IKNARVSQHYKASLTATFNIiFPEAKFAWLEEDX*DIAVDFFS 
FLSQSrHIiLEEDDSLYCISAWNDQGYEHTAEDPAr*LyRVETMPG 
IX3M\a,RRSIjYKEEliEPKWPTPEKLWDWDMWMRMPEQRRGRECIX 
PDVSRSYHFGTVGLKblKGYPHEAYFKKHKPNTVPGVQIiRNVDSI, 
KKEAYEVEVHRLLSEAEVLDHSKNPCEDSFLPDTEGHTYVAFIR 
MEKDDDFTTWTQIAKCLHIWDLDVRGNHRGLV^RLPRKKNHFLW 
GVPASPYSVKKPPSVTPIFIiEPPPKEEGAPGAPEQT 


5879 


3 


981 


RLTEAAAAGSGSRAAGWAGSPPTLLPLSPTSPRCAATWASSDED 

GTNGGASEAGEDREAPGKRRRtiGFIATAWLTFYDrAMTAGWI,VL ^ 

AIAMVRFYMEKGTHRGIiYKSIQKTI>KFFC?TPALLEIVHCI,IGIV 
PTSVlVTCJVnV52SP T1?MX7lJT.TnnLiOTvr»Tr\xior«mrtTr *w «*« 
Afcj vo. ¥ iww\^voo«.j.rrJv wjjxxHolKPiyNEESVVXjFItVAWTVT 

ElTRYSPyTFSI.LDHr*PYFIKWARYNPPIILYPVGVAGEi:»LTIY 
AAIiPHVKiaX3MFSIia.PNKYNVSFDYYYFLLITMASYIPI,FPQI* 
YFHMLRQRRKVI,HG\G*r.*KRMIK*SLQTRCPFQNNQDYLSPSF 
NNKNKQLCEISWIVWFLKI 


5880 


1.13 8 


1324 


SliWCLVAGGLGIiGPSSQNPI^QRAGIIARPREARGTFSALTACSA 
SVTSKGKSSSGMWPSAASDRDSPVPLRPPGPVQLPSGTGWVI.SD 
*KKKRGRCSS/WLSQPQttEREKEVVLLRRSMAEGERARAASDVL 
CRSLANETHQLRRTLTATAHMCQHIAKGLDERQHAQRNVGERSP 
DQSEHTDGHTSVQSVlEKr^BEim.LKQKVTHVEDLNAKKQRYN 
ASRDEYVRGtaAQLRGLQIPHEPBLMRKEISRLNRQIiEEKINDC 
ASVKQELAASRTARDAALERVQMLEQQILAYKDDFMSERADRER 
AQSRIQELEEKVASLLHQVSWRQDSRBPDAGRIHAGSKTAKYLA 
ADALELMVPGGWRPGTGSQQPEPPABGGHPGAAQRGQGDLQCPH 
CLQCFSDEQGEEIiLRHVAECCG 


5881 
5882


26 


441 


GGIHPSPTEAPRAQHLTMDCTWRILFIiVAAATGTHAQVQLLQSG 

SEVKKPGASVMVSCYVSGYTLTKLSMHWVRQAPGKGbB*MGPFD 

LQDVETIYPQKFQGRVSMTEETSTETTQ/AYLELSSLRSEDTAV 
HHCATDTV 




2407 


2216 

■ 

: 

I 


SGCVEMLYSHSI.EyKPEWISVOSAVAPAQLALNSDGDi,*LHSGE 
RTRRD*QLPSAGGPGLQEPLQLGELDITSDEFILDEVDG\VDLR 
EiYSKQVELELQQlEQKSIRDYIQESENXASLHNQlTACDAVr^R 
MEQMLGAFQSDLSSISSEIRTLQEQSGAMNIRLRNRQAVRGKLG 
ELVDGLWPSALVTAILEAPVTEPRFLEQLQELDAKAAAVREQE 
^RGTAACADVRGVLDRLRVKAVTKIREFIJjQKIYSFRKPMTNyQ 
I PQTALLKYRFFYQFLUSNERATAKEIRDEYVETLSKI YtiSTYR 
3YIiGRLMKVQYEEVAEKDDr*MGVEDTAKKGFFSKPSIJlSRpmF 
rLGTRGS VISPTELEAP I LVPHTAQRGEQRYPFEALFRSQH YAL 
jDNSCREYbFICEFFWSGPAAHDLFHAVMGRTLSMTLKHLDS^ 
.4AIX:YDAIAVFI^IHIVLRFRNIAAKRDVPArj3RYWEQVlAI,Ld^ 


5883 


2 




PRFELlLEMNVUSVRSTDP^jkLGGLDTRPH-nTHkifAEPSSAT 


5884 


4261 


1374 


atehesdiaslqedlcrmqneledmerirgSxemeia^eme 

JbWISALLWCWWAETSS | 


5885


900 




«?r^^^^f™!^^°''^°""^*°^^SADEr.VTRIHICVRQM 
HI.LXSETSVANGSQSESSVSTPSASBTEPNWTCENSQSHMAELCE 
IPSTSDTKSDTATGGESAGHATSSQEPSOCSDQRP^DW^E 
RLTKKr,EERREEKRKEEEeRElKKEIERRKTCKEMLDYra^E 
KLTKRMLEERHREKAEDRAARERIKQQIALDRA^ARR^S 

TNQFPSDAPLEEARQFAACTVGMWGNPSLAra^R^l^^Y 
KKKLLDLEiAPSASVVLLP/ALFINF.AG^^SSI^^^ 
=ci:!^''^'^^"^^"'"^^SHPPPTQTSVRVTSSEPPKPM K 


5886


86 


467 




5887 


1537 


1341 


f t-HGRALTLKKUPlmjVAPPamTCHKSDPGtU'AAUaof PSPflS 

GTrGH,SFRMVRTKTWTI,KKHFVGYPTNSDPBLKTSEI,PPI.lQIG 

EVI,I.EAI,FLTVDPYMHVAAKRLKBGDTI*1GQQVAKWESKtIVAL 

PKGTIVIASPGWWHSISMKDLEKLLTBWPDTIK^SLAMT^ 

MPGLTAYPGLLEICGVKGGETVMVNAAAaAVGSVVGQlAKLKGC 

KWGAVGSDEKVAYLQKI/SFDWFNYKTVESLBETUCKASpSy 

«™S^°^^^'''^^^°<»"'«^«IAICGAlSTYimTGPI,PPGP 

PPEIGIYQEIJJMEAFVVYHWQGDARQKRDia)LLKWVLEI,PYFV 

D*LQANTI,VYKSMKSAKPSI,EYISEM,VSG\KlQYKEyiIEGPE 
NMPAAFMGMLKGDNLGiCriVKA X'^^UIW.XJ.XEGFE 






104 

J 

^ 
1 
I 
I 

£ 
I 

K 
D 
S 


R^RmSSmprSSfSr^'^''^^^^'^'*^'^^^ 
RPDRCHPGODDRGPQLHRGSPG/SPSELSRRPGPPGLPGLQGPP 

PAPGI.PQSRTL/PVLCVCI)MPAQCDIWCCa>PDCSSVt)FSVPS 

■CIHITN\*NUlYPI.LIQKyL/NENNFDTLMfCrSDGFTLNAESY 
reFTTKLDIPTAAKrEYGVPLQTSDSPLRPPSSLTSSLCTDHNP 
iAFLVNQAVKCTRKlNLEQCEEIEALSMAFYSSPBILRVPDSRK 
CVPITVQSIVIQSLNICTLTRREDTDVLQPTI,VIIAGHFSLCVNW 
.EVKYSLTYTDAGEVTKADLSFVLGWSSVWI^JQKFEIHFLO 
•NTQPVPLSGNPGYVVGLPLAAGFQPHia3SGriQTTORYGQI,Tl 
.HSTrEQDCLAI^GVRTPVLFGYTMQSGCKLRnTGALPCQLVAQ 
VKSI,r*'GQGFPDYVAPFGIISQGP/ADMI,DWVPIHFrTQSEOTaC 
SCQLPGALVIEVKVfTKYaSI.LNPQAKIVlIVTAlJI,ISSSFPEaN 
GNERTIL1STAVTFVDVSAPAEAGFRAPPAINARI.PPNPFFP| 



WO 01/53312



PCT/US00/34263



SEQ
ID
NO:



5888



"5889" 



5890 



5891 



Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 



Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 



375 



2302 



731 



Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,



1322 



1322 



200 



\=possible nucleotide insertion)
LLCRTPGVA^JgkADSHyPSKRPRCD^SPRTPSNTPSAEADWSl^ 
LELHPDYKTWGPEQVCSFLRRGGFEEPVLI^KKIRENEITGADLP 
CLDESRFEWLGVSSI/SERKKLLSYIQRIiVQiHVDTMKVINDPlH 
GHIELEIPLLVRIIDTPQFQRLRYIKQLGGGYyVFPGASHNRFEH 
SIiGVGYIAOCLVHAI/3EKQPEI^ISERDVI.CVQrAGLCHDLGHG 
PFSHMFDGRFI PLARPE VKWTHEQGSVMMFEHblNSNG 1 KPVME 
QYGLIPEEDICFIKEQIVGPLESPVEDSLWPYKGRPENKSFLYE 
IVSNKRNGlDVPKMJYPARDCJlHLGrQNNFDYraiPIKFiUlVC^ 
DNELRICARDKEVGNLYDMPHTRMSLHRRAYQHKVGNIIDTMIT 
DAFl.KADDyiErrGAGGiaCYRISTAXDDMBAyTKLTDNIFr.ElL 
YSTDPKLKDAREXLKQIEYRNLFKYVGETQPTGQIKIKREDYES 
LPKEVASAKPKVLLDVKLKAEDFIVTVINMDYGMQBKNPIDHVS 
FYCKTAPNRAIRITKNQVSQLLP\EKFAEQ\LIRVYCKICVDRKS 
LyA\ARQYFVQW\CADR\NFT\KPQDGRC:Y*PPTP*HPQKK:GW\ 
NDSTFS PKI PTRLPRRLPKSRV\Q LFKDDPM 

LPAACGRPVTARPRQAPEGRSGRPRPL DPYPPQVFPPRPDRVAl 
VTGGTDGIGYSTAKHLARLGMHVIIAGNNDSKAKQWSKIKEET 
LNDKET*VLLCCPGWLCLWNSSDPPTSASRGAGTTGVHHHFIiLK 
FGI PIL\DLASrrrS IRQFVQKFKMKKIPLHVLINWAGVMMVPQR 
KTRDGFEBHFGLNYWSHFIiLTNLLLDTIiKESGSPGHSARWTVS 
SATHYVAELNMDDLQSSAOfSPHAAYAQSKIiALVLFrYHLQRlur* 
AAEGSHVTANVVDPG\mrrDLYKHVPWATRLAKKlJjGWLLFKTP 
DEGAWTSIYAAVTPELEGVGGRYLYNKKETKSLHVTYNQKliQQQ 
LWSKSCEMTGVLDVTL 

FRRGWSAAGRAVPVAFCSRISASSPRRPRGAVRLQSGTEAACRS" 
GRPDPRPASAAGGHAGERMSQRDTIjVHLFAGGCGGTVGAILTCP 
LEWKTRU3SSSVTLYISEVQLNTMAGASVNKVVSPGPLHCLKV 
ILEKEGPRSLFRGLGPNLVGVAPSRAIYFAAYSNCKEKLNDVFD 
PDSTQVHMISAAMAGFTAITATNPIWLIKTRLQI.* /SQGTAGKR 
RMGAPECVRKVYQTDGIiKGFYRGMSAS YAGISETVXHFVI YES I 
KQKLLE YKTASTMENDEES VKEASDPVGMMIiAAATS K\ LVATTI 

ayphevvrtri.reegtkyrsffqtlsllvqebqygslyrgi*tth 

LVRQX P\NTAIMMATYEIiWYLliNG 

frrgwsaagravpvafcsrisassprrprgavrlqsgteaacrs" 
grpdprpas aagghagermsqrdxlvhlfaggcx3gtvgai i .tcp 
lewktrlqsssvtlyisevqlntmagasvnrwspgplhclkv 
ilekegprslfrglgpnlvgvapsraiypaaysnckeklndvfd 

PDSTQVHMISAAMAGFTAITATNPIWLIKTRLQI,* /SQGTAGKR 

rmgafecvrkvyqtdglkgfyrgmsasyagxsetvihfvxybsi 
kqklleyktastmendeesvkeasdfvgmmlaaatskxlvattx 

AYPHEWRTRIiREEGTKYRSFFQTLSLLVQEEGYGSLYRGLTrH 
LVRQX P \NTAIMMAT YELWYLLNG 



379 



wlrvcgrlsvnsavssrtggwsagltcamqrlqwlghlrgpX" 

dsgwmpqaapclsgaphasaadwwhgrrtaicragrggfkdt 

tpdellsavmtavlkdvnlrpeqlgdicvgkvlqpgagaimari 

aqplsdipetvplstvnrqcssglqavasxaggirngsydigma 

cgvesmsladrgnpgnitsrlmekekardcltpmgitsenvj\er 

fgisrekqdtfalasqokaaraqskgcfqaeivpvtttvhddkg 

tkrsitvtqdegirpsttmeglakiikpafickdgsttagkssqvs 

DGAAAILLARRSKAEELGLPILGVIiRSYAWGVPPDIMGIGPAY 

ax pvalqkagltvsdvd i peine \apasqaaycveklrlpp * eg 
*tplggasgp*ghplglhwghvqvitlaq*s*sargkrayrsgc 

PCAXGSWMGS PLPVFE YPWGT 



xlskrrcqkaktkelmakkvavigagvsglxslkccvdeglept" 
cfertediggvwrpkenvedgrasiyqswtntskemscfsdfp 
mpedfpwflhnskliieyfrifakkfdllkyiqfqttvlsvrkcp 
dfsssgqwkwtqsngkeqs avfdavmvcsghhinphi p1.ks fp 

GMERPKGQYFHSRQYKHPDGFEGKRILVXGMGNLGSDIAVEIiSK 
NAAQVFISTRHGTWVMSRrSEDGYPWDSVFHTRPRSMLRNVLP% 
TAVKVWIEQQMKRWFNHEMYGLEPQNKYIMKEPVU^DDVPSRLiy 
CGAIKVKSTVKELTETSAXFEDGTVEENIDVIIFATGYSFSFPF 


5894






LhlJSLVKVEN^^WSLYlCYIFPAHLDKSTI5raG^ 


5895


174 
2967 


1673 


^n^^S^"^^^'^'^^^^^^^^^EQFLIILPKE3X5ARVQEH 

KGVQECX3VRHECEVTKPEKEKGEETRIENGKl.rVVTDSCGRVES 
SGKISEPMEAHNEGSMLERHQAKPKEKIEYKCSEREQRFIQ^ 
LIEHASTHTGKKLCESDVCQSSSLTGHKKVLS*BRKVIQC\HGV 
ifj^^S^^^^^^QKIHLGEKPYQCNECGKVFSQNAGLliHLR 
IHTGEKPYI.ClHCGKWFRRSSHLNRHQRJHSQEEPCECKECaKT 
FSQALLLTHHQRIHSHSKSHQCNECGKAPSLTSDLIRHHRIHTG 


5896


2967


86 «t^ai.i.oAIPFYPPPSSPMPPPLVLFWKSHRfCSRHFINQRGIHGE 

EATEl.QPTI.SAALyn.\VVQGKKG\EDVLGSVRRTLTHlDHSLS 
RO\NCPFIAGETESIa?U3IVLWGALYPI,U5DPArLPEELSALHS^ 

BGKGLSPIEPE3EEIATI^EBElAMAVTAWBKKI,ESLPPLRPOO 

■ NPVI.PVAGERNrVLlTSALPyV^WPHI>GNIlGCVLSADVP^^ 

RLROWNTLYLCGTOEYGTATETKALXEEGLTPQEICDKYHIIHA 

Diy\RWFNISPDlH^RTrrPQQ\TKIT\QDIFQQr^KRGPVLQD 

TVEQLRCBHCARFXrJ^UJRFVEGVCPFCQYEEARODQCDKCG^I 

NAVELKKPQCKVCRSCPWQSSQHLFLDX.PKI.EKRLEEW1^RTL ''^ 

PGSPl^PNAQFITPFFGFREWPSKPRWQ*TRDLK\WGNP6TP*E 

GFEDK\VFYVWFDATIGYLS1TANYTDQWBR^W\KNPEQVDLYQ 

FM\AKI)JA/PFHSLVFPS5Arx?AEDWYTL\VSHLIATEYLMYEDG 

K\FSKSRGVGVPRDM\AHDTGIPPDISRFYL\LYIRPEGK\DSA 

PSWTDLLLKNNS\ELLNNLGNFINRA\GMPVSKFFGG\yvPEMV 

I.TPDDQRLLA\lIVTLELQimrQ\LI,EKVRIRDALRSILTlS\RH 

GNQyi\QVNEPW\KRIKGSEADRQRAX3TVTGLAVNIAALLSVML 

QPyMPTVSATrQAQLQLPPPACSILI.TNFl.CTIaPAGHQIGTVSP 

LFQKLENDQIESLRQRFGGGQAKTSPKPAWETVTTAKPQQIQA 

LMDEVTKQGtriVRELKAQKADKNEVAASVAKI»LDLKKQIAVAEG 
KPPEAPKGKKKX 






**** JJ^fi'^^^P^^i'i't'SSPWPPPr^YLFWNSHRKSRHFINQRGIHGE 
MRLFVSlXtVPGCLPVIAAAGRARGRAEVLISTVGPEDCVVPFLT 
RPKVPVLQLDSGNYLFSTSAICRYFF\IJ[^GWEQDDLTNQWLBW 
EATELQPTLSAALYyL\VVQGKKG\EDVLoSVRRTr,THIDHSLS 
SS:!^5^^^^^^^^^^"<^^YP^^DPAYLPEELSALHSW 
FQTLSTQ\EPCQR\AARRLVI,KQ\QGVIAI,R\PYLQKQPQPSPA 
L^^^^f^^^^^^^^^^^^^^^^^^^^AWEKGLESLPPLRPQQ 
NPVLPVAGERNVI.ITSALPyVNtn;PHI^IIGCVLSADW^ 
RLRQWNTLYLCGTDEYGTATETKALXEEGLTPQEICDK^ 
™^!^^^^^^f^^'^^^^\'f^^T\QDIFQQLLKRGFVLQD 
TVEQI,RCEHCARP\LADRFVEGVCPFCGYEEARGDQCDKCGKr.I 
NAVELKKPQCKVCRSCPWQSSQHLFIJJLPKLEKRLEBWLGRTL 
PGSDWTPNAOFITPFFGFRBWPSKPRWQ*TRDLK\WGNPGTP*E 
GFEDK\VFyVWFDATIGYLSITANYTDQWERWW\KWPEQVDLYQ 
FMVAKDNVPFHSLVFPSSALGAEDKYTLWSHLIATEYLNYEDG 
K\FSKSRGVGVFRDM\AHDTGIPPDlSRFYL\LyiRPEGK\DSA 
FSWTDLLL,KNNS\E]XNNLGNFrNRA\GMPVSKFFGG\YVPEr4V 
LTPDDQRLLAXHVTLELQHYHQXLLEKVRIRDALRS IbTIS\RH 

gnqyixqvnepwXkrikgseadrqragtvtglavniaallsvml. 
qpymptvsatiqaqlqlpppacsilltnflctlpaghqigtvs? 
~- 1 i^fqklendqieslrqrfgggqaktspkpawetvttakpqqiqa 








LMDEVTKQGNIVRELKAQKADKHEVAAEVAKLUJLKKQLAVA^ 
KPPEAPKGKKKK 


5897


2967 


86 


HPSLLGAIPFYPPPSSPWPPPLyLFWNSHRKSRHFINQRGtKGE 
MRLFVS DGVPGCLPVLAAAGRARGRAEVLI STVGPEDCVVPFIiT 
RPKVPVLQLDSGNyLFSTSAICRYFF\LLSGWEQDDI,TKQWriEW 
BATEIiQPTLSAAI>yYl,\WQGKKG\EDVLGSVRRTLTHXDHSLS 
RQ\NCPFI*AGETBSLADIVLWGAI,YPLI*QDPAYIiPEELSALHSW 
FQTUSTQ\EPOQR\AARRLVIiKQ\QGVrALR\PYLQK0PQPSPA 
EGKGLSPXEPEEEELATLSEEEIAMAVTAWEKGLESLPPLRPQQ 
NPVLPVAGERNVL I TSALPYVWNVPHLGNI IGCVLSADVFAR YS 

DIY\RWFNISFDIFGRTTrPQQ\TKIT\QDIFQQI*t,KRGmiQD 

tveqlrcehcarfXiju^rfvbgvcpfcxsyeeargdqcdkcjgkli 
navslkkpqckvcrscpvvqssqhlfldlpklekrleewtigrtl 
pgsdwtpnaqfitpffgfrewps kprwq*trdi:»k\wgnpgtp ♦ E 

GFEDK\VFYVWFDATIGYLSITANYTDQWERWW\KNPBQVDLYQ 
FM\ AKDNVP FHSLVFPSSALGAEDNYTIi\ VSHt*IATEYl,NYEDG 
K\FSKSRGVGVFRDM\AHDTGIPPDISRFYI*\LY1RPBGK\DSA 
FSWTDLLl4KNlfS\ELLNNIjGKFINRA\GMFVSKPPGG\YVPEMV 
I*TPDDQRLLA\HVTLEIiQHYHQ\l,LEKVRIRDALRSll,TIS\RH 
GMQYI\QVIXEPW\KRIKGSEADRQRAGTVTGrAVNIAALLSVMI. 
QPYMPTVSATlQAQLQI^PPPACSILLTNFLCrLPAGHQIGTVSP 
liFQKLENDQI ESIiRQRFGGGQAKTS PKPAWETVTTAKPQQIQA 
LMDSVTKQGNIVRELKAQKADKNEVAAEVAKT.IJDLKKQLAVAEG 
KPPEAPKGKKKK 


5898 


2967 


86 


HPSuLGAIPFypppSSPWPPPLYLFWNSHRKSRHFIKQRGIHGE— 

MRLFVSDGVPGCLPVIAAAGRARQRA£VI,ISTVGPEDCVVPFt*T 

RPKVPVI,QLDSGNYLPSTSAICRYPP\I*LSGWEQDDI.TWQWLEW 

EATELQPTLSAALyYL\WQGKKG\EDVIX5SVRRTLTmDHSriS ^- > 

RQ\MCPFr»AGETre:SIiADIVLWGALYPLLQDPAYI»PEELSALHSW 

FQTLSTQ\EPCQR\AARRLVLKQ\QGVIiALR\pyLQKQPQPSPA 

BGKGLSPIEPEBEEIArriSEEEIAMAVTAWEKGIiESLPPLRPQQ 

NPVI#PVAGERNVI,ITSALPYVNNVPHIjGNI igcvlsaovparys 

RtiRQWNTLYI,CGTDEYGTATETKAL\BEGLTPQEICDlCYHIIHA 

DIY\RWPNISPDIFGRTTXTQQ\TKIT\QDIFQQLLKRGFVI<2D 

TVEQLRCEHCARP\LADRFVEGVCPPC3GYEEARGDQCDKCGKLI 

NAVBLKKPQCKVCRSCPWQSSQHLFIiDIiPKLEKRLBEWIiGRTL 

PGSDWTPNAQFITPFFGFREWPSKPRWQ*TRDI.K\WGIIPGTP*E 

GFEDK\VFYVWFDATIGYLSITANYTDQWERWW\KNPEQVDliYQ 

FM\AKDNVPFHSLVFPSSAIjGAEDNYTL\VSHLIATEYLNYEDG 

K\FSKSRGVGVFRDM\AHDTGIPPDISRFYL\I*YIRPEGK\DSA 

FSWTDi:*LLKmiS\ELIJINLGNFINRA\GMFVSKPFGG\YVPEMV 

LTPDIX?RI*IA\HVTI,ELQHYHQ\l4r*BKVRlRDALRSIIjTIS\RH 

GNQ YI \QVKEP W\KR IKGSBADRQRAGTVTGLAVNIAALLSVML 

QPyMPTVSATIQAQLQLPPPACSlIibTOFX,CTI*PAGEQIGTVSP 

LFQKLENDQlESnRQRFGGGQAKTSPKPAWETVTTAKPQQlQA 

LMDE\mCQGNIVREIiKAQKADKNEVAAEVAKI*LDLKKQLAVAEG 

KPPEAPKGKKKK 


5899


326 


1078 


NCPKSKEPNGVRAPSLPSPLRAAMALSDVDVKKQIKHMMAPIEQ " 

EANEKAEEIDAKAEEEFNIEKGRLVQTQRLKIMEYYEKKEKQIE 

QOKKlLMSTMRNQARLKVLRARNDLISDIiLSEAKLRLSRlVEDP 

evyckslldklvlqgllrllepvmivrcrpXqdlllveaavqkai 

PEYMTISQKHVEV\QIDKEA*LAVECSWEVWEVYSGNQRIKVSK 
TLESRUDLSAKQKMPEIRMAIiFGANTNRKFFI 


5900 


64 


1409 


KAASRDSPCI^EFCPLCGVSSHDLQHRMWYHRLSHLHSRLQDLLK 
GOVIYPALPQPNFKSLLPLAVHWHHTASKSLTCAWQQHEDHFEL 
KYANTVMRFDYVWLRDHC3iSASCYNSKTHQRSI*DTASVDLCIKP 
KTIRIiDETTLFFTWPDGHVTKYDLNWLVKNSYEGQKQKVIQPRI 
LWNAE lYQOAQVPS VDCQS FLETNEGr*KKFLQNFU*YGI AFVEN. 
VPPTQBHTEKliABRISIilRETIYGRMWYFTSDFSRGDTAYTK^l 
LDRHTDTTYPQEPCGIQVFHCLKHEGTGGRTI.r.VDGFYAAEQVIi 








QKAPEEFErjL>SKSAl\KHEYIEDVGECHQPHDWDWAQS*"3:5fHS~~ 
/YKELYI^IRYNNYDRAVINTVPYDWHRWYTAHRTLTIELRRPE 
KEFWVKLKPGRVLFIDbJWRVLHGRECFTGyRQLCGe^LTRDD\rti 
NTARLLGLQA 


5901 




2121 


vaieqtslkmkqavggaparptgeyicnqcgakytsldsfqthlt " 

kthldtvlpfcltcpqcwkefpwqesllkhvtihfmitstyyice 

scdkoftsvddlqkhtildmhtfvffrctlcqevfdskvslqujl 

\avkhsnekkvyrctsci^dfrnetdi<}lhvkhnhlenOgicvhk 

crfcgesfgtevelqchitthskkynckfcskafhairllekhi* 

RE KHC VFETKTPNCGT?JGAS EQVQ KEB VE ZjQTLLTNSQESHNSIi 
DGSEEDVDTSBPMYGCDICGAAYTMETLLQNHQLRDHNIRPQES 
AIVKKKAELIKGNYKCNVCSRTFFSENGLREHMQTHLGPVKEYM 
CPICXjBRFPSIitiTIiTEHKVTHSKSLDTGNCRICKMPl^QSEEEFIi 
EHCQMHPDLRNSLTGF^CWCMQTVTSTLELKIHGTFHMQKTGN 
GSAVQTTGRGQHVQKLYKCASCLKEFRSKQDIiVKLDINGLPYGL 
CAGCVRTjSKSASPGINVPPGTWRPGLGQNENLSAIEGKGKVGGL 
KTRCS*LATFKF*VLKVELPEPHPKPFHRGVSRPDSNSTQLKTP 
QVSPMPRISPSQSDEKKTYQCIKCQMVFYNEWDIQVHVANHKID 
EGLWHECKLCSQTFDSPAKLQCHLIEHSFEGMGGTFKCPVCFTV 
FVQAtnCLQQHIFSAHGQEDKIYDCTQGPQKFFFQTEI^NHTMTQ 
HSS 


5902 


712 


209 


LKNRRRSRPS 1 RQS IGSTSVSRWLTSLFTYLDHTADVQ * V* REF 
IPL:<PRQ*ED*MFQSWLHAWaDTLESAFEQCAMAMFOYMTDTGT 
VEPLQTVKVETQGDDLQS LLFKFLDEWIiyKFSADE PPI P \GWGB 
EFSIiSKHPQGTEVKAITYSAMQVYNEENPEVFVIIDI 


5903 


2lb6 


735


dtpgpslpsttapfsurslsppsrpsyllpgdpqplqgrglptt ' 

PALFALSAVPGGAASPMPPSGLRLLPLLLPLSiMI*LVI,TPGRPAA 
GLSTCKTIDMEIiVKRKRXEAIRGQILSKLRLASPPSOGEVPPGP 
LPEAVniiTU^YNSTODRVAGESAEPEPEPEADYYAKEVTRVLMVET 
HNErYDKFKQSTHSIYMPFNTSELREAVPEPVI.LSRAEI,RLr*RI» 
KLKVEQHVELYQKYSNNSWRYLSNRLbAPSDSPEWLSFDVTGW 
RQMLSRGGEIEGFRI>SAHCSCXiSRDNXLQVDINGFTTGR\RaDr. 
ATI HGMNRPFLIiLMATPLERAQHLQS \SRHRQAL\DTNY\CFSF 
HGGRMCLRC/VHC*HLI FRKDL\GM\ KWI \HE\ PKGYHANFC\L 
GPCPYIWSLDTQYSKVIiAliYNQ\HKPG\ASAAP\CC:VPQALEP\ 
LPIVYY\VGRKPKVEQIiSNMIVRSCKCS 


5904 


3 


1126 


WMEErEt5AINT?KEEQRI.IYEELIKEEKTTNNELSAISRKiDTW 
ALGNSETEKAFRAISSKVPVDKVTPSTLPEEVLDFBKFLOQTGG 
RQGAWDDYDHQNFVKVRJTKHKGKPTFMEEVLEHLPGKTQDEVQQ 
HEKWYQKPLALEERKKESIQIWKTKKQQKREBIPKLKEKADNTP 
VLFHNKQEDNQKQKEEQRKKQKUVVEAWKKQKSIEMSMKCASQL 
KEEEEKBKKHQKERQRQFKIiKLLLESYTQQKKEQEEFLRLEKEI 
REKAEKAEKRKNAADElSRFQERDLHKLEIiKILDRQAKEDEKSQ 
KQRRLAKLKEKVENKVSRDPSRLY/NTHQRIiGRTMQKDRTNRLW 
ATSTYPT^GYSNLETRNTEKSMR 


5905 


287 


2912 


POASFPPRVNEKEIVRLRTIGELrAPAAPFDKKCKRENWTVA^ 
DLifa X AWbgGHRTVKI*VP WSQCXQNFLI»HGTKbn/TNrSSSLRIi?R 
QNSDGGQKNKPREHIIDCGDIVWSLAPGSSVPBKQSRCVNIEWH 
RFRFGQDQLLIATGLWSGRIKIWDVYTGiaUUjNLVDHTGVVRDL 
TFAPDGSLILVSASRDKTIJIVWDLRDIXJN\MMKVLRGHQN&7VY\ 
SCAFS PDSSMLCS VGASKAWAAILV* LRLCWHHSHTGATM VLS 
WAERVASliATGIiGATFTIG*SNLAFVIiQGVLYVHRCWSMSTFCF 
SFFLFFFPKVISPTVKYH*LLSJaiFQFYGIGSLTSETNLM*SI 
WLSNGFSVLFFGILSDSRDII,RL*FNLKFVI,IFF*K*CIVSVQK 
KKKPKR I ALIiQEERLS *DKPPSSHLI *QTEVNIRILiFRAI LHS * 
LLIFRI *NCI *TYS * IIDPFYIQMTYDRG*FGKNKMVKF*PIEM 
* liYYFHKI AFSFCNW*HPCCU?KKFHIAVNILFACS ICFSS * A 
QVGDPSl,L*TSDYLKGRCQWSNNliLTLRFLSVYFFKlJLWSGKK 
REGGL* YLTLFIS VYFS * LVFGINGFQYS FWKIiHCLYFMFRIJ 
FKLTFNRNI *NRICMSALINLKTDFNLTMTJ[,SIFFKLLI lYMA* 
YNI^*I*QF*YKMCHFVX*CMSE*SYNICI,FIAGF\LWNMDKYTM 








IRKLEGHHHDWACDFSPDGALLATASYDTRVYIWDPHNGDILM 
EFGHLFP PPTP I FAGGANDRW VRS VSFSHDGLHVASLADDKMVR 
FWRIDEDYPVQVAPIiSNQIiCCAFSTDGSVIiiiAGTHDGSVyFWAT 
PRQVPSLQHL.CRMS IRRVMPTQEVQELP IPSKTiliEFLSyRI 


5906


X4 6 




pgehstdnwrtypsiqimnyygkgkv\ritlvtk\ndpvkphph 

DLVGKDCRD\GyYEAE FGQEXRRP \IiFFQN\LGIRCVKKKEVKE 
A\IITR\IKAGINPFDVP*KQIiNDIEDCDU>VVRLWFRVFLPDG 
HGNL\TTALPPV\VSSPiyDNRAPNTAELRVCRVNKNCGSVRGG 
DEIFLLCDKVQKDDIEVRFVLNDWEAKGIFSQADVHRQVAIVPK 
TPPYCKAITEPVTVKMQLRRPSDQEVSESMDPRYLPDEKDTYGN 
KAKKQKTTLIiFQKLCQDHVETGFRHVDQDGLELLTSGDPPTLAS 
QSAG r TVN FPERPRPGIiLGSIGEGRYFKKEPKLFSHDAWREMP 
TGVSSQAESYYPSPGPISSGLSHHASyiAPLPSSSWSSVAHPTPR 
SGNTNPLSSFSTRTLPSNSC2GIPPFI.RIPVGKDliNASNACIYNN 
JiDDIVGMEASSMPSADLYGISDPNMuSNCSVNMMTTSSDSMGET 
DNPRIiliSMNLENPSCNSVriDPRDLROLBQMSSSSMSAGANSNTT 
VFVSQSDAFEGSDFSCADKSMINESGPSKSTfiTPKSHVFVQDSQY 
SGIGS MQNcQIjSDS F P YE FFQV 


5907 


99 


1873 


TYhhSSVtSS * * WLDTKIKSQVKV/RKGHKKISWPYPQPAKQMGK 
KATSKVPSAPHFVHPNDHANREAELKKKWVEEMREKQQAAREQE 
RQKRRTIESYCQDVLRRQEEFEHKEEVLQEIiNMFPQLDDEATRK 
AYYKEFRKVVEYSDVILEVLDARDPLGCRCPQMEEAVXiRAQGNK 
KI*VIiVIJSrKIDLVPKEVVEKWLDYIiRMEIiPTVAFKASTQHQVKN^ 
JNK^o V yvL^nZyctaLuaPia i\J\\^c\3AaaiJt'li\vLAjtci X ^KJjuJc^ VK X nJLK 
VGWGLPNVGKSSIiINSLKRSRACSVGAVPGlTKFMQEVyi.DKF 
IRLi:jDAPGIVPGPNSEVGTII»RNCrviIVQKLADPVTPVETII.QRC 
NLEEISNYYGVSGFQTTEHFLTAVAHRLGKKKKGGLYSQEQAAK 
AVLADWVSGKISFYIPPPATHTLPTHLSAEIVKEMTEVPDIEDT 
EQANEDTMSa^TGESDKLLCDTDPLEMEIKLLHSPMTKIADAI 
ENKTTVYKIGDLTGYCT2. r.^RHQMGVJAKRNVDHRPKSNSMVDVC 
SVDRRSVLQRIMBTDPLQOGQAIiASALKNKKKMQKRADKIASKL 
SDSMMSALDLSGNADDGVGD 


5908


247 


975 


HOtIKKRGEGSGSPSPASGGFQIjGCQIPEPSLPSEEETHPHTRA 

htrtlratlxrrpprshstrlrfpmpldgdgglaswk/pmrer* 
gwrrpakaag7vslgvaatgkrgcrms kryliqkatkgklli 1 1 fi 
\m.wgkvv*ssam!hkmhvktgtcewaxihrccnknki eersqt 
vkcscfpgqvagttraapscvdasiveqkwwchmqpclegeeck 
vlpdrkgwscssgnkvkttrvth 


5909 


1 


5002 


pai pgsti iwapgshsaaradgrhgslipsqsqapgaiicgarapp 
ssnlradrsmicaqaragkhlyhnrflgjlaamafpsrnsqslrr 
ckepirysvnpdqfhnmdlrggprdgvtiprstsbtdiivtsdsr 
stimgrssyysighsqdiivihwdikeevdagdwigmylrdevls 
enfldyknrgvngshrgqi iwkidassypvepetkicfkyyhgv 
sgaijwia^psvtvknsaapifksigadetvqgqgsrriiisfsls 
dfqamglkkgmffnpdpylkisiqpgkhsippalphhgqerrsk 
i igntvhp i wqabqfsfvslptdvxieievkdkfaksrpi ikrfl 
gklsmpvqrltierhalgdrwsytlgrriiptdhvsgqlqfrfei 

TSSIHPDDEEISLSTEPESAQIQDSPMNNLMESGSGEPRSEAPE 
SSESWKPEQU3EGSVPDRPGNQS I ELSRPAEBAAVITEAGDQGM 
VSVGPEGAGELLAQVQKDIQPAPSAEEIAEQLDLGEEASAMiLE 
DGEAPASTKEEPLEEEATTQSRAGREEBEKEQEBEGDVSTIiEQG 
EGRLQLRASVKRKSRPCSLPVSELBTVIASACGDPETPRTHYIR 
IHTLLHSMPSAQGGSAAEEEDGAEEESTI.KDSSEKDGI»SEVDTV 
AADPSALEEDREEPEGATPGTAHPGHSGGHFPSLANGAAQDGDT 
HPSTGSESDSSPRQGGDHSCEGCDASCCSPSCYSSSCYSTSCYS 
SSCYSASCYSPSCYNGNRPASHTRFSSVDSAKISESTVFSSQDD 
EEEEWSA.FESVPDSMQSPELDPESTNGAGPWQDEIAAPSGHVER 
SPEGLBSPVAGPSNRREGECPILHNSQPVSQliPSItRPEHHHYP| 
IDEPLPPNWEARXDSHGRVFYVDHVNRTTTWQRPTAAATPDGMR 
RSGSIQQMEOUTRRYQNXQRTIATERSEEDSGSQSCEQAPAGGG 


5910 1526




U^GGSDSEASSSQSSLDLRREGSLSPVNSQKITLbUjSPAVgFl-- 

TNPEFFTVLHANYSAyKVFTSSTCl.KHMXLICVRRDARNFERYOH 

NRDLWPINMFADTRLELPRGWEIKTDQQGKSFFVDHNSRATTF 

IDPRIPLQNGRLPNHLTHRQHLQRLRSYSAGEASEVSRNRGASL 

LARPGHSLVAAIRSQHQHESLPLAYNDKIVAFLRQPNIFEMLOE 

RQPSIiARNHTLREKIHyiRTEGNHGLEKLSCDADI,VII.LSXiF?E 

ElMSYVPLQAAFHPGYSFSPRCSPCSSPQNSPGUiRASARAPSP 

YRRDFEAKLRNFYRKLEAKGFGQGPGKXKLlIRRDHIiEGTFNO 

VMAYSRKELORKKLYVTFVGSEGLOYSGPSREFFFLI^SOEIiPNP 

YYGI.FBYSANDTYTVQISPMSAFVENHI.EWFRFSGRILG\IALI 

HQYLLDAFFT\RPFYKALL\RLPC\D\LSDI,EYM)KEFHQSLQW 

MKDNNITDIL0LTFTVNEEVFGQVTERELKSGGANTQVTEKNKK 

BYJBRNVKWRVERGWCQTEALVRGFYEVVDSRI.VSVPDAREX,B 

LVIAGTAEIDLNDWRNNTEYRGGYHDGHLVIRWPWAAVERPNNE 

QRLRLLQFVTGTSSVPYEGFAAPPWEPMGLRRPLP*KKWGKITS 

LPPRGNHTCLQPDWDLPTVSPRTPMLYEKVLLTAWEKTSTPrta. 


5911 109


446 VAi:;r-AAMEPGRTQIKLDPRYTADLI,BVI,KTWV0IPSACFSQpOT 

AAQLLRALGPVElALTSILTLXiALGSIAIFLEDAVYLYKNTLCP 

IKRRTLLWKSSAPTWSVLCCFGLWIPRSIlVLVE^3TXTSPYAVC 

FYLiLMLVMVEGFGGKEAVLRTLRDTPMMVHTGPCCCCCPCCPRI, 

LLTRKKLQ\R*CWALSNTPS*R*R*PWWACFSSPTASMTQQTPI. 

RGACLYGSTI^SA/CSTLIALWTLGIISRQARIJaXSEQNMGAKP 

ALFQVLLILTALQPSIFSVIANGGQIACSPPYSSKTRSQVMNCH 

Lr*ILBTFLMTVLTRfyiYYRRKDIIKVGYETPSSPDIJ>I,Nt,KAI,RWM 
1 AWTMKGCCTH 




595 QiiPLAPCIQGKGLtEMRSPKPQSyxiRSSHSGAGLLVkNPSTPVF— 
CGHRRGGAAFKYKPTPWGPEQRPTGQKHMRGGVSLLSPRLECS 
GTISAHCNTLRLPSSSNS PAPAS * LAGITGVCHHAQLIFVFLVET 
GFHHVGQAGriELL/NWIHLpRPPKVLGLQA 


5912 924 

1 — — — 


277 WIIJ^l^WIX3AIaALrrVMSPCGGEDIVADHVASYGVNLYQ5YGP ' " 
SGQYSHEFDGDEEFYVDLERKETVWQLPLFRRFRRFDPQFALTM 
lAVLKHKLNIVIKRSNSTAATNEVPErTTVFSKSPVTLGQPNTLI 
CLVDNIFPPWNXTWLSNGHSVTEGVSETRPSSPKSDHFrMDQ 
VTSPSFPFE**DL*TAKVEQLGAWFEPLLKHWGAEIPTTL 


5914 




llyb j QLRMAGAEGAAGkU±^ELKPVVSLVDVI.EEDEEI>BNEACAVLGqS 
1 DSEKCSYSQGSVKRQALYACSTCTPEGEEPAGICIACSYECHGS 
HKLFELYTIOiNPRCDCGNSKFKNLSCKLLPDKAKVNSGNICXKDN 
FFGLYCICKRPyPDPEDEIPDEMXQCWCEDMFHGRHIXSAXPPE 
SGDFQEKVCQACMKKCSFLWAYAAQLAVTKIST\GMMDWCGTLM 
E* /DDQEVIKPENGEHQDSTLKEDVPEQGKDDVREVKVEQNSEP 
CAGSSSESDLQTVFKNESI^AESKSGCKLQELKAKQLIICKDTAT 
yWPLNWRSKLCTCQDCMKMYGDLDVLPLTDEYDTVIAYENKGKI 
AQATDRSDPLMDTLSSMWRVQQVELIC/GIQ*FED 




5915 


960 


124 NI^SELPPEEALFIQVASMNQRRVDPYIASIEDMLVAr/GGRN" 

ENGALSSVETYSPKTDSWSYVAGI.PRFTYOHAGTIYKDFVyiSe 

GHDYQIGPYRKNLIiCYDHRTDW/BERRPMTTARGWHSMCSLGDS 

lYSXGGSDDNXESMERFDVLGVEAYSPQCNQWTRVAPLLHANSE 

SGVAVWEGRIYII/3GYSWENTAFSKTVQVYDREADKWSRGVDLP 

KAIAGGSACFIAP*SLGQRTRKRICAKARGTRTGASDPSCASWDH 
PHRHLPGIiCRPAATS 




5916 


1604 


^03 ^'PGRPTRPLtOGRRRKIiARXIQAPHaiSPRPRTCPPGALQAPEA 
PASRAEGPVAVWNGHTEGPAPARSAPKEPPGLPRPLGSFPCPT 
PQEDFPALGGPCPPRMPPSPGFSAWLLKGTPPPPPPGLVPPIS 
KPPPGFSGLLPSPHP\PVSPAPPPPPPQK/RPRLLPAP /PGLPS 

PRELPGEEPSAHPVHQqLPAERRGPLQRVQEPLRGVQTGPDLRS 
PVLQELPGPAGGEPPEGIi* *AAGPAAH 




5917 


256 
1343 ( 


633 SPRMWEIWGPWHRWESFSLEGEWPSRIPEPSPDSTkgTSGKGCR" 
TVTGAVHRHLNHVAGIIPWVLHSQLKPTAATAQDQWTSOQYPDH 
1 PTRLILQ*NQATAr>KNN*TTALLQPHQRL\VSPRMAEA 4 
_ 827 1 AHQILTYLEP/lCLVVKYNKIIiTVFItTKSVirEJ^KFIHTPQTYR 










--'^WUFFGIKBVYVSRRIJ^KTSF/felAVTFLEQAWSKECVPYDQ- 

FMEHLLPSLLSLASDPVPNVRVJCtAKAUlQMLLEKAYFRNASNP 

HLEVIEETirALQSDRDQDVSPPAALEPKRRMIIDTAVLEKON 




5918


13 


1247 


EGAQVARRRSRRQWRAGRCGRGRGGRRAERTGGRGPPGRPRPLP" 

PGPARRGRRRMETPFYGDEALSGLGC-GASGSGGTFASPGRI.PPG 

A?PTAAAGSMMKKDALTLSIiSEQVAAALKPAPAPASYpPA\ADG 

A?SAAPPDGLtASPDLGU:,KlASPELERLXIQSNGLVTTTPTSS 

QFLYPKVAASEEQEFAEGFVKALEDLKKQNQLGAGRAAAAAAAA 

AGGPSGTATGSAPPGEIAPAAAAPEAPVYA\NLSSY\AGGCRGli 

RGGAAT\VAPAAEPVPPPPPPPPGALGPRRP/RLALQGRRPQTV 

PDVP\SFGESP\PI,SPIET\DTPRRI\KAKRKRL\RKrPQIRAPK 

PASRKLGAQSRALERESEDPS*SPEHGSLASTASLI*REQVAOLK 

QKVLSHVNSGCQLLPQHQVPAY 




5920


1 


4254


TSVQGDSQGTPTSSQGSINMEHWtSQAIHGSTTSTTSSSSTQSQ 

GS6AAHRLADVMAQTHIENHSAPPDVTTYTSEHSIQVERP0GST 

GSRTAPKYGNAELMETGDGVPVSSRVSAKIQQLVNTIiKRPKRPP 

LREFPVDDFEEIiliEVOQPDPNQPKPEGAQMLAMRGEQLGWTNW 

PPSLEAAUSRWGTISPKAPCLTTMDTNGKPLYILTYGKLWTRSM 

KVAYSIIJiKl^TKQEPIWRPGDRVALVFPNiroPARFMAAPYGCr. 

lAEWpVPIEVPLTRKDAGSQQlGFLLGSOGVTVALTSPACHKG 

LPKSPISGEIPQFKGWPKLLWFVTESKHLSKPPRDMPXPHIKDAM 

NDTAYIEYKTCK\DGSVLGVTVTRTALLTHCQALTQACX3YTEAE 

TIVNVLDFKKJJVGLWHGILTSVMNMMHVISIPYSLMKVNPbSWI 

QKVCQYKAKVACVKSRDMHWALVAHRDQROINLSSIiRMLIVADG 

ANPWSISSCD2VF1*NVFQSKGLRQEVICPCASSPEALTVAIRRPT 

DDSNQPPGRGVLSMHGLTYGVIRVDSEEKLSVLTVQDVGLVMPG 

AIMCSVratDGVPQLCRTDEIGELOrcyVVAaXJTSYYGLSGMTKNT 

FEVFAMTSSGAPISEypPXRTGLLGFVGPGGI,VFWGKMDGLMV 

VSGRRHNADDIVATALAVEPMKFVYRGRIAVFSVTVr^HDERXVI 

VAEQR?DSTEE0SFQWMSR\aiQAlDSIHQVGVYCLALVPANTLP 

KTPLGGIHLSETKQLFLEGSLHPCNVLMCPHTCVTNLPKPRQKQ 

PBIGPASVMVGNLVSGKRIAQASGRr>LGQIEDNDQARKFI,FI^E 

VLQWRAQTTPDHIIiYTIiLNCRGAXANSLTCVQLHKRAEKIAVMri 

MERGHI^QDGDHVALVYPPGIDLIAAFYGCLYAGCVPITVRPPHP 

CNIATTLPTVKMIWSKSACLMTTQliICKLLRSREAAAAVDVR 

TWPLlI,DTDD*PKKRPAQICKPCNPOTIAYLDPSVSTTGMrAaV 

KMSHAArSAFCRS IKLQCELYPSREVAICU5PY0GLGPVLWCLC 

o V xovjfxyaxitx f fisttiitt iWPAiiWijijAVSQYKVRDTFCSysVMEIL 

CTKGLGSQTESLKARGLDLSRVRTCVWAEERPRIALTQSFSKL 

FKDLGIJIPRAVSTSFGCRVNriAICLQGTSGPDPTTVYVDMRAi;,R 

HDRVRLVERGSPHSIiPLMESGKILPGVRIIIANPETKGPLGDSH 

I/3EIVrVHSAHNASGYFTIYGDESLQSDHFNSRLSFGDTQTIWAR 

TOYLGFLRRTELTDANGBRHDALYWGALDEAMELRGMRYHPID 

rETSVIRAHKSVTECAVPTWTNLLWWElUDGSEQEArjDLVPLV 

TNVVI,EEHYLIVGVVVVVDIGVIPINSRGx2KQRMHi:.RDGPLADn 

LDPIYVAYNM ^ 






5921


1381 


1499 


QLGAVAHAGVSRIPP*LFPPI,HPTFnSLWCIiHHKLP /HPPGASM " 
VRPPWPRRPPAHISSVRQASTQVPRTVPHTQRVAKIGTQTTGP 
SGVGCC7PGRPI>LPCKCSSAAHSTYRVQEPAVHIPGQEPI*TASM 
IiAAAPLHEQKQMIGERLYPLIHDVHTQIAGKITGMLLEIDNSEL 
LLML-ES PESLHAKIDEAVAVLQAHQAMEQPKAYMH 




5922 


727 
2475 


157

1 

495 ' i 


VCPGTGGE*GLweQl^GLPKETPLKPMDAFTG£«LKRkFDD>TO^ 

GSSVSNSDDEISSSDSADSCDSLNPPTTASPTPTSIl,KRQKQliR 

RKinmFDQVT\nrYFARRCX3FTflVl»SCXK3SSLGMAQRHNSVRSY^ 

LCEFAQEQEVNHREIIJ?EHLKEEKliHAKKMKLTKNGTVESVEAD 

3LrLDDVSDEDIDVENVEVDDYFFLQPI.PTKRRR7U,LRASGVHR 

IDAEEKQELRAIRLSREECGCDCRLYCDPEACACSQAGIKCQVD 

RMSFPCGCSRDGCGWMAGRIEFWPrRVRTHYLHTIMKLELESKR 

3\GAAQQPQ\*GALPDCQLQPDRSTGJ:.*DPSWIGSKGI.SFTGKa 

\AATHI,I ILRVIENRGAEGKRK ^ 

nrSNWGLFPSVFJQVPRSRTQNLKPIFIjFYSYYE\CMETIJCG\T ' " 








CLYNATQYKVCSPRNDRPDACYNPS E PAATTVFEIRTGLtiLGDT^ 

S KI I TRTEEKEI PKQl TLRFDACAAINS KKLE IGCGSLN <t Ellfe ♦ 

RVENKYVCHESGVCKNCAYWPCVI * AT*KKNKNDSVyLQKGEAN 

PSCAAGHCNPLELIITNPLDPHWKKGERVTLGINRTGLKPQWI 

LI KGEVHKCS PXPVFQTPYEBLNLPAPELriKKTKNLFLQIiAENV 

IFLLNGTSCyVRGGTTlGDRWPWEA*ELVPTDPAPDIIPI*KAE 

ASNP*VLKTSIIRQYCIAREaKDFIIPVGKPNClGQKI*YNSTTK 

TIT * * DLifflTEKNPFSKFSKLKTA* AHAESH *DWTVPSGLY* IC 

RHRAYFRIiPNKWADSCVIGTIKPSFFLLPIKMGELLGPSVYASR 

EKKGIVIGNWKDNEWPRERIXQYYGPATWAQDGSWGYR/TP/VY 

MLNWI IRLQAILEIISNETGRALTVLAWQETQMRNAIYQNRLAIi 

DYLLVAEGGVCRKFNLTMCCLQINDQGQWKNIVRDMTKLAHVP 

IQVWHKFDPESLFGKWFPAIGGFKTLIVGVLLVIRTCLLLPCVL 

PLLFQMI KG IVATLVHQKTS AHVNYMNHYRSISQRDSKSEDESE 

MSH 


5923 


137 


636 


QLCGRRGQRFRTSIKRMHPI * RTCPNTNL/ IILL^QENTQIRtH. 
CK3BNRELWISLEEHQnALELI^^KYRKQMLQLMVAKKAVDAEPV 
LKAHQSMSAEIESQIDRICEMGEVMRKAVQVDDDQFCKIQEKIiA 
QLELENKEtiRELLS ISSESLQARKBNSMDTASQAI K 




274 


2146 


EKGKVKDAGAEQWI SLSLS CKGSWETQFSNHLNSLTPPTS VRRM " 

PLITTVTIiLKMVARHHKKIiLCSKAFSTQIiQQKIFLHSQKGIHHQ 

SVCMKLKPNTSHIISILMGQPMALVQI*ETIjAPIjTIIIQKFQTQD 

HMKFWKin:^PLHSHHi:.TPSVPCrrVIPKKTGSPE I KLKITKTIQNG 

REI,FESSLCX5DLI*KKVQASE\Q*NQSIESEUCEKRKKSNKK0SSR 

SEERKSHKIPKLEPEEQNRPNERVDTVSEKPREEPVIiKEGSPSS 

ANTlFCSNNGSVHW\FKFQVGDLVWSKVGTYFWWPCMVSSDPQtt 

EVHTKIWTRGAREYHVQFFSNQPBRAWVHEKRVREYKGHKQYEE 

LLAEATKQASNHSEKQKIRKPRPQRERAQWDIQIAHASKALKMT 

REERIEQYTFIYIDKQPEEALSQAKKSVASKTEVKKTRRPRSVL 

NTQPEQTNAGEVASSLSSTEIRHHSQRRHTSAEEEKPPPVKIAW 

KTAATUlKSIiPASITMHKGSLDLQKCNMSPWKIEQVFALONATG 

DGKFIDQFVYSTKGrGNKTEISVRGQDRLIISTPNQRNEKPTQS 

VSSPEATSGSTGSVEKKQQRRSIRXRSESEKSTEWPKKKIKICE 

QVGFLHVES 


5925 


216 


1911 


MMTAESREATGLSPQAAQEKDGIVIVKVEBEDEEDHMWGQDSTL 
QDTPPPDPEIPRQRFRRFCYQNTFGPREALSRLKELCHQWLRPE 
IKTKEQILEDr,VLEQFLSILPKELQVW3:iQEYRPDSGEEAVTLLE 
DLEIiDLSaQQVPGQVHGPEMLARGMVPLDPVQESSSFDLHHEAT 
QSHPKHSSRKPRIaLQSRALPAAHIPAPPHEGSPRDQAMASALFT 
ADSQAMVKI3DMAVSLI LEEWGCQNLARRNLSRDNRQENYGSAF 
PQGGENRNENEESTSKAETSEDSASRGETTGRSQKEFGEKRDQE 
GKTGERQQKNPEBKTRKEKRDSGPAIGKDKKTITGERGPREKGK 
GLGRSFSLSSNFTTPEEVPTGTKSHRCDECGKCFTRSSSIiXRHK 
I IHTGEKPYECSECGKAF\SLNS \nlvi,hqri \htgekphecne 
CGKAFSHSSNblliHQRIHSGEKPyECNECGKAFSQSSDXLTKHQ 

rihtgbkpyecsecgkafnrnsylilhrrvhtrekpykctkcgk 
Xaftrsstltlhhrihareraseyspasldafgaflkscv 


5926 


2 


233 


drclmucqgsqpgsppat/ceppappvyqapcqscpeppgahep 
sdsphhtpvhpppehsaacpapatccppprssms 


5927


414^ 


1248 


KHFSKPGSQAIiYQLKRPASGQNSISVMPAQKITKPAAKYGIPLA^ 

ykkygdkki>hekkplqkhkqahqtpekr\mtgeerrkiseeaar 
krrlefiekekkqkdqiislmkaeqmkrqekerlerinrareqg 
wrnvlsaggsgevkapflgsggtiapssfssrgqyehyhaifto 

MQQQRAEDNEAKWKRE I YGRGLPERQKGQIxAVERAKQVEEFLQR 
KREAMQNKARAEGHMGILQNIiABMYGGRPSSSRGGKPRNKEEEV 
YLARLRQIRLQNFNERQQIKAKLRGBKKEANHSEGQEGSEEADM 
RRKK\ IES LKAHANARAA^^KEQLERKRKEAYEREKKVWEEHIlV 
AKGVKSSDVSPPIiGQHETGGSPSKQQMRSVISVTSALKBVGVDS 
SLTDTRETSEEMQFCTNNAISSKREILRRLNENI.KAQEDEKGKQN^ 
LSDTFEItmrEDAKEHEKBKSVSSDRKKWEAGGQLVIPXiDEIiTJ 
DTSFSTT^RHTVGEVIiCLGPNGSPRRAWGKSPTDSVI^lLGEAJS 








IK2LQTELLENTTIRSEISPEGEKYKPLITGEKKVQCISHEINPr^ 

aivdspveckspefseaspqmslklegnleepddleteilqepS 

GTNKDE\SLPCTITDVWISEEKETKETQSADRXTIQElIEVSEDO 
VSSTVDQLSDrHIEPGTNDSQHSKCDVDKSVQPBPPPHKWHSE 
HLNLVPQVQSVQCSPEESFAFRSHSHLPPKNKNKNSLLlGIiSTG 
LFDANNPKMLRTCSLPDLSKLFRTLMDVPTVGDVRQDHliEIDEI 
EDENlKEGPSDSEDIVFEETDTDLQEtiQASMEQLLREQPGEEYS 

eeeesvlknsdveptangtdvadeddnpssesalkeewhsdnsd 
geiasececdsvfnhiieelrleleqemgfekffevykklkaihe 

DEDEMIEICSKIVCNILGNEHQHLYAiaLHLVMAOGAYQEDNDE 


5928 


4146 


1248 


KHFSKFGSyALYQLKRPASGQNSISVMPAQKITKPAAKYGiPIiA '" 

YKKYGDKKLHEKKPUJKHKQAHQTPEKRVNTGEERRKISEEAAR 

KRRLEFIEKEKKQKDQIISLMKAEQMKRQEKERLERINRAREtXS 

WRNVLSAGGSGEVKAPFLGSGGTIAPSSFSSRGQYEHYHAIPDQ 

MQQQRAEDNEAKWKREIYGRGLPERQKGQIAVERAKOVEEFLQR 

KREAMQNKARAEGHMG ILQNLAAMYGGRPSSSRGGKPRNKEEEV 

YLARLRQIRLQNFNERQQrKAKLRGEKKEANHSEGQEGSEEADM 

AKGVKSSDVSPPLGQHETGGSPSKQQWRSVISVTSALKEVGVDS 
SLTDTRETSEEMQKTNNAISSKREILRRLNBNLKAQEDEKGKQN 
I*SDTPEINVHEDAKBHEKEKSVSSDRKKWEAGGQI.VIPLDELTL 
DTSFSTTERHTVGEVIKLGPNGSPRRAWGKSPTDSVLKILGEAE 
LQLQTELLENTTIRSEISPEGEKYKPLITGEKKVQCISHEINPS 
AIVDS PVETKS PEFSEAS PQMSIiKLEGNIiEEPDDLETEILQEPS 
GTNKDE\ SLPCT ITDVWISEBKETKETQSADR IT! QENEVS EDG 
VSSTVDQLSDIH I EPGTNDSQHSKCDVDKSVQPE PFFHKWIISE 
HLNLVPQVQSVOCSPEESFAFRSHSHIiPPKNKNKNSrJMGLSTG 
LEDANNPKMLRTCSt.PDI^KriFRTLMDVPTVGDVRQDNIjBIDEI 
EDEKIKEGPSDSEDIVFEETDTDIiQELQASMEQIiLREQPGEEYS 
EEEES VI#?NSDV3S ?TANGTDVADEDDNPS SESALNEE WHS DNSD 
GEIASECECDSVFNHLEELRIiHriEQEMGFEKFFEVYEKIKAIHE 
DSDENIEICSKlVQN-JtiGNEHQHC.YAKILHIiVMADGAYQEDWDE 


5929 


3 


1558 


ttdfsmttqlpayvaillfyvsrascqdtftaavyehaailpnat 
ltpvsreealalmnrntjdilegaitsaadqgahi ivtpedaiyg 
wnftirdslypyledi pdpevnw i pcnnrnrfgqtpvq3riiscl\ 
aknnsiywanigdkkpcdtsdpqcppdgryqyntdwfXdsqg 
klvaryhkqnlfmgenqfnvpkepeivtfnttfgsfgiftcfdi 
iifhdpavtlvkdfhvdtivfptawmnvlphlisavefhsawamgm 
rvnflasnrhypskkmtgsgiyapnssrafhydmkteegklli.s 
qldshpshs awnwts yass lealssgnke pkgtvffdbfxfvk 
ltgvagtnflvcqkdlcchi^sykmsenipnevyalgafdguitve 

GRYYLQICTLLKCKTTNIjNTCGDSAETASTRFEMFSI^SGTFGTQ 
YVFPEVLLSENQLAPGEFQVSTDGRLFSLKPTSGPVIirVTljFGR 
LYEKDWASNASSGL?AQARI IMLI VI AP I VCSbSW 


5930


113 


6082 


RGNCFWIVPFTMAORTGIiEDPERYLFVDRAVIYWPATQADWTAK 

KLVWIPSERHGFEAASIKEERGDE\mVELAENGKKAMVWKDDIQ 

KMWPPKFSKVEDMAELTCLN£ASVI,HNLICDRyirSGI.XyTYSGLP 

CWINPYKNLPIYSEWIIEMYRGKKRHEMPPHIYAISESAYRCM 

LQDREDQSlLCTGESGAGKTENTKKVIQyiiAHVASSHKGRKDHtf 

IPGE\LERQIiLQ7VNPILESFGttARTVQNDNSSRFGKFIRINFDV 

TGYIVGANIETYIiLEKSRAVRQAKDERTFHl FyQI.LSG\AGEHL 

KSDLLIjEGFNNYRFIiSNGYIPIPGQ\QDKGNFRGDPGEAt>IHIMG 

FSHEEILSMLKWSSVLQFGNISFKKERNTDQASMPEMTVAQKI, 

CHIiLGMNVMEPTRAILTPRI KVGRDYVQKAQTKEQADFAVEALA 

KATYERLPRWLVHRINKALDRTKRQGASFIGILDIAGFEIFELN 

SFEQI/!INYT^IEICLQQLFNiITMFILEQEEYQREGIEWNFIDFGI, 

DLQPCIDLIERPANPPGVIiALLDEECWPPKATDKTFVEKLVQEQ 

GSHSKFQKPRQLKDKADFCIIHYAGKVDYKADEWtJyiKNMDPLND 

NVATLLHQSSDRFVAEIiWKDVDRIVGIilXJVTGMTETAPGSAYICr 

KKGMFRTVGQLYKESLTKLMATLRNTNPNFVRCIXPNHEKRAGK f 

LDPHIiViUDQIiRCNGVI.EGIRICRQGFPNRIVFQEFRQRYEIIiTP | 



6082 






NAIPKGFMDGKUACKRMIRALk:LDPWl.YRiGQSKIFF^VLM^ 
Lfc:Kfc:RDLKITDIIIFFQAVCRGyLARKAFAKKQQQLSAL,ICV^ 
NCAAYLKLRHWQWWRVFTKVKPLLQVTRQEEELQAKDEELLKVK 
EKQTKVEGELEEMERKHQQLLEEKNIIAEQLQABTELFA^M 
RARLAAKKQELESILHDI>ESRVEEEEERKQILQNEKKKMQAHIQ 
DLEEQLDEEEGARQKLQI^KVTAEAKIKKMEEEILLLEDQNSKP 
lKEKiCLMEDRlAECSSQIAEEEEKA.fCNIAICCRNKQEVMrSDLEE 
RLKKEEKTRQELEKAKRKLDGETTDI^QDQIAELQAQIDELKLQL 
AKKEEELQGALARGDDETLHKNNALXWRELQAQIAELQEDFES 
EKASRNKAEKQKRDLSEEI,BALKTEI.EDTLDTTAAQOELRTKRK 

cevaei.fckaleeetknheaqiqdmrorhataleelseqleqaS 

FKANLEKNKQGLETDNKELACEVKVLQQVKAESEHKRKKLDAQV 
QEIiHAKVSEGDRLRVELAEKASKIiQNEUDNVSTLLEEAEKKGIK 
FAKDAASi:iESQr>QDTQEIiI^EETRQKLNLSSRIRQI.EEEKKSLO 
EQ0EEEEEARKKLEKQVLAIX3SQLADTKKKVDDDLGTIESLEEA 
KKKLbKDAEALSQRI^EKAIAYDKLEXTKNRLQQELDDLTVOIJ) 
HQRQVASNt.BKKQ\KKFDQU:AEEKSISARyAEERDRAEAEARE 
rarrKALSIJUlALEEALEAKEEFBRQNKQIJlADMEDLMSSKDDVG 

NMQAMKAQFERDLQTRDEQNEEKKRLLIKQVRELBAELEDERKQ 
RALAVASKKKMEIDl.KDLEAQIEAANKARDEVlKQLRfCLQAQMK 
DYQRELEEARASRDEIFAQSKESEKKLKSLEAEILQLQEELASS 
ERARRHAEQERDEIADEITNSASGKSALU)EKRRLEAR1AQLEE 
EIiEEEQSNMELLNDRFRKTTI^VDTlJaAEIAAERSAAQKSDNAR 

qqlerqkkelkaklqelegavkskfkatisaleakigqleeole 

QBAKERAAANKLVRRTEKKLKEIFMQVEDERRHADQYKEQMEKA 

karmkqlkrqleeaeeeatranasrrklqrelddateaneglsr 



RGNCFWIVPrXMAQRTGLEDPERYLFVDRA7j.lfNPATQADWTAK: 
KLVWIPSERHGFEAASIKEERGDEVMVEIAEiraKmiVNKDDI 
mVPPKFSKVEDMAELTCLWEASVLKNLKDRYYSGLlytTYSGLF 
CWINPYKNI^PXYSENIIEMYRGKKRHBMPPHIYAISESAYRCM 
LQDRpQSILCTGESGAGKTENTKKVIQYIAHVASSHKORKDHW 
IPGEVLERQXJ^QANPILESFGNARTVQNDNSSRFGKFIRXNFDV 

tgyivganietyixeksravrqakdertfhifyqllsgXagehi, 

KSDLrjOBOPNNYRFLSNGYIPlPGQ\QDKGNFRGDPGEAMHIMO 

fsheeilsmlicwssvlqfgnisfkkerwtdqasmpentvaqkl 

cmU3MNVMEFTRAILTPRIKVGRDYVQKAQTKE0ADFAVEALA 
KATYERLFRWLVHRINKALDRTKRQGASFIGILD2AGFEIFELN 

sfeqlcinytkeklqqlfnhtmpilbqeeyqregiewwfidfgl 

DLQPCIDLIERPANPPGVLALLDEECWFPKATDKTFVEKl^VQEQ 
GSHSKFQKPRQLKDKADFCIIHYAGKVDyKADEWLMKN^^DPLND 
NVATLtiHCSSDRFVAELWKDVDRIVGLDQVTGMTETAFGSAYKT 
KKGMFRTVGQLYKESLTKLMATLJ^NTNPNFVRCI IPNHEKRAGK 
LDPHLVLDQLRCNGVLEGIRICRQGFPNRIVFQEFRORYErLTP 
NAIPKGFrroGKQACERKIRALBLDPNLYRIGQSKIFFRAGVlAH 

NCAAYLKLRHWQWMRVFTKVKPLLQVTRQBEELQAKDEELLKVK 
EKQTKVEGELEEMBRKIIQQLLEEKMXLAEQLQAETEI^FAEAEEM 
RARIAAIOCQELEEILHDLESRVEEEEERNOIJXJNEKKKMQAHIO 
DLBEQI.DEEEGARQKLQLEICVTAEAKIKKMEEEILLLEDQNSKP 
IKEKKLMEDRIAECSSQLABEEEKAKNLAKIRWKQEVMISDLEE 
RLKKEEKTRQELEKAKRKLDGETTDLQDQIABLOAQIDELKLOL 
AKKEEELQGAIJUIGDDETLHKNNALKVVREI^AOIAELQEDFES 
EKASRNKAEKQKRDI^EELEALKTELEI>TU)TTAAQQELRTKRB 
QEW^LKKALEEBTKNHEAQIQDMRQRHATALEELSEQLEQAKR 
FKANLEKNKQGLETDNKEIACEVKVLQQVKAESEHKRKKIiDAQV 
QEUIAKVSEGDRLRVELAEKASKLQNELDNVSTLLEEAEKKGIK* 
FAKDAASLESQDQDTQEr,LOEETRQKLNC^SRIRQLEEEKKSuJ 
EQQBEEEEARKNLEKQVIjaogQIJU3TlCK3CVDDDr/3TlEfiT.K'ffa 


5932






KKKI.LKDAIiAI.SQRLEEKAIAyDKLEKTK34RI^QELDDi;fVDBD 
HQRQVASNLEKKQNKKPDQLIAEEKSISARYAEERDRAEAEARE 
KETKAI^IJUiALEEALBAKEEFERQNKQLRADMEDLMSSKDDVG 
KNVHELBKSKRALEQQV\EEMRTQLEELEDELQATEDAKLRLEV 
ctis^uiju AJXi^iiUWisfcttJtKLJjIKQVRELEABLEDERKQ 
RALAVASKKKMEIDLKDLEAQIEAANKARDEVIKQLRKLQAQMK 
DYQREI*EEARASRDBIFAQSKESE!Cia.KSLEAEILQLQESIASS 
ERARRHAEOERDElADEITWSASGKST^XiDEKRRLEARIAQLEE 
ELBEEQSNMELLNDRFRKTTLQVDTLNAELAAERSAAQKSDNAR 
QQLERQNKELKAKLQELEGAVKSKPKATISALEAKIGQIiEEQLE 
QEAKERAAANKLVRRTEKKLKEI FMQVEDERRHADQYKEQMEKA 
KARMKQLKRQLEEAEEEATRANASRRKLQRELDDATEAWEGLSR 

EVSTLKNRLRRGGPISFSSSRSGRRQLHLEGASLELS0DDTESK 
TSDVNETQPPQSE 


5933 


33 


572 


RHLEEICFLFI.QKGRKLKLSGPRWEEGKPRGTGGI.WVKAEANMG 
FGATIAVGLTIPVLSWrr I ICFTCSCCCLYKTCRRPRPV\APP 
PHPP/PWHAPYPQPPSVPPSYPGPSYQGYHTMPPQPGMPAAPY 

PMQYPPPYPAQPMGPPAYHETLAGGAAAPYPASQPPYKPAYMDA 
PKAAL 


5934 


1 


3190 


UiKiU.K>aADKTPG(jiiCjKASSierRSSDVHSSGSSDAHMDASGP6t)~ 
SDMPSRTRPKSPRKHNYRNESARESLCDSPHQNLSRPLLENKLK 
APSIGKMSTAKRTLSKKEQEELKKKEDEKAAAElYEEFIiAAPEG 
SDGNKVKTFVRGGVVWAAKEEKET0EKRGKIYKPSSRPAJDQKNP 
PNQSSNERPPSLLVIETKKPPLKKGEKEKKKSNliELFKEELKQI 
QEERDERHKTKGRLSRFEPPQSDSDGQRRSMDAPSRRNRSSGVL 
DDYAPGSHDVGDPSTTXNPYIiGNlXNPQMNLiCKCCCGEFGRFGP 
LASVKIMWPRTDEERARERNCGPVAFMNRRDAERAUCNIjNGKMI 
MSFEMKLGWGKAVPIPPHPIYIPPSMMEHTLPPPPSGLPFNAQP j 
RERLKNPNAPMLPPPKNKEDPEKTLSQAIViCWIPTERNLLALl ^ 
nxui4.i::>i* v v«ii(j±^'Pii" ii^AMiWNREJtWWPMFRFIjFENQrPAHVYyRW 
KLYS Il^QGDSPTKWRTEDFRMFKNGSFWRPPPLNPYraHGMSEEQ 

eteafveepskkgalkbeqrdkleeilrgltprkndigdamvfc 

bNNAEAAEEI VDCITESIiS I IjKTPLPKKIARLYLVSDVLYNSSA 

kvanasyyrkffetklcqifsdlnatyrtiqghlqsenfkqrvm 

TCFRAS«:Dt4AIYPEPFLIKLQNIFI/3LVNIIEEKErEDVPDDLD 

gapieeeldgapledvdgipxdatpiddldgvpikslddduxsv 

PLnATEDSKKNEPIFKVAPSKWEAVDESELEAQAVTTSKWELFD 

qheeseeeenqnqeeesedeedtqsskseehhlysnpikeemte 
skfskysemseekraklreielkvmkfqdelesgkrpkkpgqsf 

QEQVEHYRDKLLQREKEKEbERERERDKKDKEKLESRSKDKKEK 

dectptrkerkrrhstspspsrsssgrrvkspspkserserser 

SHKESSRSRSSHKDSPRDVSKKAKRSPSGSRTPidtSRRSRSRSP 
KKSGKKSRSQSRSPHRSHKKSKGKT»fTGRKPFKKAVTYWKCDLF 
LCPER3VF 




1 


3190 



Ci-i-RKLKMADKTPGGSQKASSKTKSSDVHSSGSSDAHMDASGPSD 
SDMPSRTRPKSPRKHNYRNESARESLCDSPHQNLSRPLLENKLK 
AFSIGKMSTAKRTliSKKEQEELKKKBDEKAAAEIYEEFriAAFEG 
SDGNKVKTPVRGGWNAAKEEHETDEKRGiaYKPSSRFADQKNP 
PNQSSNERPPSLLVIETKKPPLKKGEKEKKKSNLELFKEELKQI 
QEERDERHKTKGRLSRFEPPQSDSDGQRRSMDAPSRRNRSSGVL 
DDYAPGSHDVGDPSTT\NFYLGNI\NPQMNLKKCCCQSFGRFGP 
LASVKIMWPRTDEERARERNCGFVAPMNRRDAERALKNtJ»GKMI 
MSPEMKLGWGKAVPIPPHPIYIPPSMMEHTLPPPPSG-.PFNAQP 
RERLKNPWAPMLPPPKNKEDFEKTLSQAIVKWIPTERNLLALI 
URMIEFWREGPMFEAMIMNREINNPMPRFLFENQTPAHVyyRW 
ECLYSILQGDSPTKWRtEDFRMFKNGSFWRPPPLNPYLHGMSEEQ 
ETEAFVEEPSKKGALKEEQRDKIiEEILRGLTpRKNDIGDAMVPC 
[^NAEAAEEIVDCITESLSILKTPLPKKIARLyLVSDVI,YNSSA 
WANASYYRKFFETKLCQrFSDLNATYRTIQGHLQSEWFKQRVM^ 
PCFRAWEDWAIYPEPFIiIKLQNIPLGLVNIIBEKETEDVPDDI.d' 
^APIEEELDGAPLEDVDGIP IDATPIDDLDGVPIKSI^DDLDGV 








PLDATEDSKKNEP I FKVAP5 KWBAVPESELEAQAV'i'Ta KWEI,FD ' 

QHEESEEEENQWQEEESEDEEDTQSSKSEEHHEiYSNPIKEEMTE 

SKFSKYSEMSEEKRAKUREIELKVMKFQDELESGKRPKKPGQSF 

QEQVEHYRDKLLQREKEKELERERERDKKDKEKLESRSKDKKEK 

DECTPTRKERKRRHSTSPSPSRSSSGRRVKSPSPKSERSBRSER 

SHKESSRSRSSHKDSPRDVSKKAKRSPSGSRTPKRSRRSRSRSP 

KKSGKKSRSQS RSPHRSHKKSKGKTNTGRKFFKKAVTYWKCDIiF 

LCPERSVF 


5935


3 


4493 


SYW^SGWRLSRPPRQFWAGWRGIGRFGTMAPVHGDDCEIGASAL 
- SDSGSFVSSRARREKKSKKGRQEALERLKKAKAGERYKYEVEDF 
TGVYEEVDEEQYSKLVQARQDDDWrVDDDGIGWEDGREIFDDD 
IiEDDAIjDADEKGKDGKARNKDKRimCKIAVTKPNNlKSM 
GKKTADKAVDIiSKDGLLGDILQDLNTETPQITPPPVMILKKKRS 
IGASPNPFSVHTATAVPSGKIASPVSRKEPPLTPVPLKRAEFAG 
DDVQVESTEEEQESGAMEFEDGDFDSPMEVEBVDLEPMAAKAWD 
KESEPAEEVKQSADSGKGTVSYLGSFI^PDVSCWDIDQEGDSSFS 
VQEVQVDSSHLPLVKGADEEQVFHFYWLDAYEDQVNQPGWPLP 
GKVWIBSAETHVSCCVMVKNIERTIiYFLPREMKrDI.NTGKETGT 
PISMKDVYEEFDEKIATKYKIMKFKSKPVEKNYAFEIPDVPEKS 
EYI.EVKYSAEMPQLPQDI.KGETFSHVFGTNTSSI*ELFLMKRKIK 
GPCWLEVKKSTALNQPVSWCKVEAMALKPDLVNVIKDVSPPPLV 
VMAFSMKT^K3NAK^mQNEI lAMAALVHHSFALDKAAPKPPFQSH 
FCVVSKPKDCIFPYAPKEVIEKKNVKVEVAATERTLLGFFIAICV 
HKIDPDIIVGHNIYGPELEVIiQRINVCKAPHWSKIGRLKRSNM 
PKLGGRSGFGERNATCGRMICDVEISAKELlRCKSYHIiSELVQQ 
IUCTERWIPMENIQNMYSESSQLLYIiLEHTWKDA\KFlJXJIMC 
EX*IVLPLALQITNIAGNIMSRTI«aGRSERMBFI,t.LHAFYElINY 
IVPDKQI FRKPQQKLGDEDEEXDGDTMKYKKGRKKGAYAffT vr 

DPKVGFyDKFILLLDFNSLYPSIIQEFNICFTTVQRVASEAQKV ' 

TEDGEQHQIPEl^PDPSLEMGILPREIRiOiVERRKQVKQLMKQQD 

UJPDLIKiYDIR'QKALKLTANSMyGCLGFSYSRFYAKPIJUVLVT 

YKGRElLMHTKEMVQfCMtlLEVIYGDTDSIMINTNSTNLEEVPKL 

GNKVKSEVNKLYKLLEIDIDGVFKSLLIiUCKKKYAALWBPTSD 

GNYVTKQELKGriDIVRRDWCDIiAKDTGNFVIGQILSDQSRDTIV 

BMIQKRLIEIGENVbNGSVPVSQFEINKALTKDPQDYPDKKSLP 

HVHVAI*WINSQGGRKVKAGDTVSYVI CQDGSNLTASQRAYAPEQ 

LQKQDWLTIDTQYYIAQQIHPWARICEPIDGIDAVIiIATGWEL 

Xdptqfkvhhyhkdeendaldggpaqltdeekyrdcerfkcpcp 
tcx5teniydnvfdgsgtdmepslvrcswiix:kaspi»tptvqi*sn 
klimdirrpikkyydgwliceeptcrnrtrhlplqfsrtoplcp 
acmkatlqpeysdksliytqlcfyryifdaecalbklttdhekdk 

LKKQFFTPKS^DYRKLKNTAEQFLSRSGYSEVNLSKIiFAGCAV 


5936 
5937 


1124 


139 


RGEEQFDAEFRRFACliGFGERliQEFSRIiLRAVHRSRAWTCYI*AI " 
RMLMATCCPSPTTTACTGPWQRAPPLRl^LVQKREADSSGLAPAS 
NSriQRRKKGLLLRPVAPLRTRPPLLISLPQDPRQVSSVIDVDrX 
tra X n«K V jcLifL^HO 4> JUKFIiGFi I RDGMS VRVAPQG \ LER VPG I PI 
SRt,VRGGIAESTGI.LAVSDEILEVtK3IEVAGKTLNQVTDMMVAN 
SHN\L1 VTVKPANQRNNWRGASGRLTGPPSAGPGPAE PDSDDD 
SSDLVIENRQPPSSMGLSQGPPCWDLHPGCRHPGTRSSIiPSIiDD 
QEQASSGWGSRIRGDGSGFSL 




31 


1600 



PTSLLKSTVUi/flCRliI^DKRYQCVysimEIFKVIiASFyVXLVIL 

YGLTSSYSLWWMLRSSUCQYSFEAUyEKSNYSDIPDVKNDFAFI 

tiHLADQYDPLYSKRPS I FLSEVSENKLKQINIOINEWTVEiaiKSK 

LVKNAQDKlECHLFMIiNGLPDNVPEDTEMEVI^SLELlPEVKLPS 

AVSQIiVNLKELRVYHSSLWDHPALAFI.EENIJ^aLRLKFTE^X^K 

IPRWVFHLKNLKELYLSGCVIiPEQLSTMQLEGFQDiaCNLRTLYL 

KSSLSRIPQVVmI,LPSLQKtlSLDNEGSKLVVLNNLKK^^mLKS 

DEI,rSa3LERIPHSIFSUWrJ{EIJ)LRENWI,KTVEEIlSFQHIia ' 

^TLSCLKLWHNNIAY I PAQIGALSNIiEQLSLDHNNIENLPLQLFD 

gTKt«HYLDI^YNHLTFrpEEIQYI«\5WLQYFAVTt^lBlMr.PI3G 
WO 01/53312

PCT/US00/34263











liFQCKKLQCL.LLGKNSXiMMLSPHVGELSNLTHRKPIG'^NYLETir" 
_ _PPE1,EGCQSLKRNCLIVEENLLNTLPLPVTERLC!TCLDKC ^ 


5938 
5939


395 


1865 


YKGEGFFCMQEARGEkkKKKKAMSSPNIWSTCSSVYSTPVFSQK"" 

MTVWIl.LLl,SLyPGFTSQKSDDDYEDYASWKTWVLTPKVPEaDV 

TVILNNLLEGYDNKLRPDIGVKPTLIHTDMYVNSIGPVNAINME 

YTI0IFFAQTWYDRRI,KFNSTIKVLRLNSWMVGKXWIPDTFFRK 

SKKADAHWI'^TPNR^lLRIWNDGRVI,ySIJRLTIDAECQI^LHNFP 

MDBHSCPLEFSSYGYPREBIVYQWKRSflVEVGDTRSKRLYQFSP 

VGLRNTTE WKTTSGDyWMSVYFDLSRRMGYFTIQTY I PCTLI 

WLSWVSFWINKDAVPARTSLGITTVLTMTTLSTIARKS3UPKVS 

YVTAMDLFVSVCFIFVFSALVEYG\TLIIYFVSNRKPSKDKDECKK 

KNPAPTIDIRPRSATIQMNNATHLQERXJEBYGYECLDGKDCASF 

PCCFBDCRTGAWRHGRIHIRIAKMDSYARIFFPTAFCLFNI,VYW 


5940 


66 


1404


IRPGYI>KEVQENSPGHRAGLEPFFDFIVSINGSRX^KDt?DTLKD"" 

LLKAmmKPVTCMLIYSSKTLELRETSVTPSNLWGGQGI.LGVSIR 

FCSFDGANENVWHVLEVBSMSPAALAGLRPHSDYIIGADTVMNE 

SEDX^FSrilBTMEAKPLKLYVYNTDTDNCREVIITPNSAWGGEGS 

LGCGIGYGYLIIRIPTRPFEEGKKISLPGQMAGTPITP1.KDGFTE 

VQI^SVNPPSLSPPGTTGXEQSIiTGbSISSTPXPAVSSVIiSTGV 

PTVP\r.LPPQVNQSLTSVPPMESSyiaHLPGLMPFTRQGLPKI,PQ 

PSTFNLPRXPTHSMPGVGLYQEFVKPGVIiPPLSSMPPRNLPGM 

APLPLPSEFLPSFPIiVPESSSAASSGELLSSLPPTSNAPSnPAT 

ttakadaassltvdvtpptakapttvedrvgdstpvsekpvsaa 

VDANTASESP 


5941 


145 


717 


rrsasrsasprqsagtavutgtraogtclaaahhrmrwradgrs"" 

LEKLPVHMGLVITEVEQEPSFSDIASltWWCMAVGrSYlSVYDH 
QGIFKRNNSRLMDElUKQQQBLLGXaJCSKYSPEFAHSWDKDrjQV 
UlCHLAVKVI»SPEDGKADIVRAAQDFCQLVAQKQKRPTDr.DVDT *A 
LAWYLVOMWIiILI 




13 


6147 



mci^rmgassprspepvgppapglpfcosgslLavvvllalpva •' 

WGQCNA?EM\;*PFARPTNLTDEFEPPIGTYLNYECRPGySGRPP 

SIICI,iaTSWTGAKDRCRRICSCRNPPDPVKK3WVHVIKaiQFGSQ 

IKYSCTKGYRLIGSSSATCIISGDTVIWDNETPICDRIPOGLPP 

TITNGDFISTNRENFHYGSWTYRCNPOSGGRKVFELVGEPSIY 

CTSNDDQVGIWSGPAPQCIIPNKCrPPKVENGILVSDNRSI*FSL 

NEWEFRCQPGFVMKGPRRVKCQALNKWEPEIiPSCSRVCQPPPD 

VI»HAERTQRDKDNFSPGQEVFYSCEPGYDI,RGAASMRCrPQGDW 

SPAAPTCEVKSCDDFMGQLLNGRVLFPVNLQIiGAJCTOFVCDEGP 

QLKGSSASYCVIAGMESLWNSSVPVCEQ X FCPSPPVI PJTGRHTG 

KPLEVPPPGKAVNYTCDPHPDRGTSFDLIGESTIRCTSDPQGNG 

WSSPAPRCGlLGKCQAPDHFLFAKLKTQTNASDFPIGTSliKYE 

CRPEYYGRPFS1TCI,DNLVWSSPKDVCKRKSC3CTPPDPVNGMVH 

VITDIQVGSRINYSCTTGHRLIGHSSAECriiSGKAAHWSTKPPI 

CQRXPCGLPPTIANGDFISTNRENFHYGSWTYRCNPGSGGRKV 

FEIiVGEPSIYCTSNDDQVGIWSGPAPQCirPMKCTPPNVENGIL 

VSDNRSLFSLNEWEFRCQPGFVMKGPPJIVKCQALNKWEPELPS 

CSRVCQPPPDVLHTVERTQRDKDNFSPGOEVPYSrEPnYnr.CffiaTi 

SMRCTPQGDWSPAAPTCBVKSCDDFMG0IiI.NGRVLPPVNU2r,GA 
KVDFVCDEGFQLKGSSASYCVIiAGMESr.WNSSVPVCEQlFCPSP 
PVIPNGRHTGKFLEVFPPGKAVNYTCDPHPDRGTSFDHGESTl 
RCTSDPQGNGVWSSPAPRCX3ILGHCX3APDHFLFAKLKTQTNASD 
PPIGTSLKYECRPEYYGRPFSITCUDNLVWSSPKDVCKRKSCKT 
PPDPVNGMVHVITDIQVGSRINYSCTTGHRLIGHSSAECILSGN 
FAHWSTKPPrCQRI PCGLPPTI ANGDFISTNR3NFOTGSWTYR 
:mX5SRGRKVFELVGEPSIYCTSNDDQVGIWSGPAPQCIIPNKC 
rPPNVENGII,VSDNRSLFSrjIEWEFRCQPGFVMKGPRRVKCQA 
JIKWEPBLPSCSRVCQPPPEILHGEHTPSHQDNFSPGQEVFYSC 
:PGYDLRGAASIiHCTPQGDWS PEAPRCAVKS CDDFLGQLPHGRV 
iFPLNLQLGAKVSFVCDEGFRX.K6SSVSHCVLVGMRSI,WNNSV]^ 
rCEHIFCPNPPAILNGRHTGTPSGDI PYGKE IS YTCDPHPDRGM 




5942






~xr»i.xui:;a-lJLKUrb'DPHGNGVWSSPAPRCKtjiVRAGHCKTPEQr 
PFASPriPlNDFEFPVGTSLNYECRPGyFGKMPSISCLBNl.vWs 
SVEDNCRRKSCGPPPEPFNGMVHIWTDTQFGSTVNYSCNEGPRL 
IGSPSTTCLVSGMNVTWDKKAPICEIISCEPPPTISNGDFYSNN 
RTSFHNGTWTYQCHTGPDGEQLFELVGERSIYCTSKDDQVGVW 
SSPPPRCISTWKCTAPEVEN3MRVPGNRSFFSLTEIIRFRCQPG 
FVPWGSHTVOCOTNGRWGPKtiPHCSRVCQPPPEIDHGEHTLSHQ 
DNFSPGQEVFYSCEPSYDLRGAASLHCTPQGDWSPEAPRCTVKS 
CDDFUSQLPHGRVLLPLNLQLGAKVSPVCDEGFRHJCGRSASHCV 
lAGMKALWKSSVPVCEQIFCPNPPAILNGRHTGTPLGDlPYGKE 
VSYTCDPHPDRGMTFNLIGBSTIRRTSEPHGNGVWSSPAPRCBL 
PVGAACPHPPKIQNGHYIGGHVSLYLPGMTISYTCDPGYLLVGK 
GFIFCTDQGIWSQUDHYGKEVNCSFPLPMNGISKELEMKKVYHY 
GDYVTLKCEDGYTIiEGSPWSQCXJADDRWDPPIAKCTSRTHDALI 
VGTLSGTIFFILLIIFLSWIILKHRKGNNAHENPKEVAIHLHSQ 
GGSSVHPRTIiQTNEENSRVLP 




5943 


4509 


688 


Xi.YVKMRANPl^yGiSHKAYQIDPPL\RKHREQ\l.Vll\VGRiar 

DKXAQMIRFEERTGYFSSTDLGRTASHYYIKYNTXETFNEbFDA 

HKTBGDIFAIVSKAEEFDQIKVREEEXEELDTLLSNFCEBSTPG 

GVENSYGKINILLQTYINRGEMDSFSLISDSAYVAQNAARIV:?A 

LFEIALRKRWPTMTyRLLNIiSKAlDKRLWGWASPIiRQFSIl,PPH 

MLTRLEEKKLTVDKLKDMRKDEIGHILHHVNIGLKVKQCVHQIP 

SVf4MEAFIQP iTRTVIiRVTLS I YADFTKNDQVHGTVGEPWWIWV 

EDPTNDHXYHSEYFLALKKQVISKEAQbLVFTIPIFEPLPSQYY 

IRAVSDRWLGAEAVCIIKFQHLILPERHPPHTELLDLQPX,PITA 

I^KAYETVLYNFSHPNPVQTQrPHTLYHTDasa/LIXSAPTGSGKT 

VAAELAI FRVFNKYPTSKAVYXAPLKALVRERMDDWKVRIEEicL 

GKKVIELTGDVTPDMKSIAKADLIVTTPEKWCGVSRSWQNRNYV 

QQVTILIIDEIHLLGEERGPVLEVIVSRTtJFISSHTEKPVRIVG 

LSTAXiANARDL^^WLWIKQMGLFNFRPSVRPVPIiEVHlQGFPGQ 

HYCPRMASMNKPAFQAIRSHSPAKPVLIFVSSRRQTRLTALELI 

A?IATEEDPKQWLNMDEREMENIXATVRDSNbKLTIAFGlGMHH 

AGLHERDRKTVEELFVWCKVQV:.IATSTLAWGVNFPAHLVI IKG 

TEYYDGKTRRYVDFPITDVLQMMGRAGRPQPDDQGKAVILVHDX 

KKDPYKKFLYEPFPVBSSLLGVLSDHLNAEIAGGTITSKQDALD 

YITWTYFFRRLIMNPSYYNLGDVSHDSVNKPLSHtilEKSLIELE 

LSYCXEIOEDNRSIEPLTYGRIASYYYLKHQTVKMFKDRLKPEC 

STEELLSIXiSDAEEYTDLPVRHWEDHMNSEIiAKCIiPXESNPHSF 

BSPHTKAHLLLQAHI^RAMLPCPDYDTDTKTVJLDQALRVCQA^IL 

DVAANQGWLVTVIiNITNLXQMVIQGRWLKDSSLLTLPKIENHHL 

HLFKKWKPIMKGPHARGRTSIECLPELIHACGGKDHVFSSMVBS 

ELHAAKTKQAWNFLSHLPEINVGISVKGSWDDLVEGHNELSVST 

L7ADKRDDNKW1KLHADQEYVLQVSLQRVHFGFHKGKPESCAVT 

PRPPKSKDEGWFULGEVDKRELIALKRVGYXRNHHVASLSPYT 

PEIPGRYIYTLYFMSDCyr/SLDQQYD/NI^QRYTSESFCTCQHQ 
Gh 


'-"V 




1 


2274 



DKPTRHKTYI.SSSWAKMAAAEGPVGDGELWQ'TWLPNHVVFUII.R"~ 

KGr>KNQSPTEAEKPASSSLPSSPPPQLLTRHWFGLGGELPt.WD 

GEDSSFLVVRLRGPSGGG\EEPALSQYQRLLCINPPLFEIYQVL 

LSPTQHHVALIGIKGLMVLELPKRWGKNSEFEGGKSTVNCSTTP 

VAERFFTSSTSLTI,KHAAWYPSEILDPHWLLTSDNVIRIYSLR 

EPQTPTNVI IbSEAEEESLVLNKGRAYTASLGETAVAFDFGPIA 

WPKTbFGQNGKDEWAYPIiYILYENGETFIiTYISLLHSPGN/I 

^jkavcsiahasVaaednygydacavlclpcvpnilvxatesgml 

fHCWLEGEEEDDHTSEKSWDSRIDLIPSLYVFECVELELALKL 
^GEDDPFDSDFSCPVKLHRDPKCPSRYHCTHEAGVHSVGLTWl 
4KLHKFLGSDEEDKDSLQEIiSTBQKCFVEHILCTKPI,PCRQPAP 
ERGFWIVPDILGPTMICITSTYECLIWPLLSTVHPASPPLLCTR 
5DVEVAESPl,RVr*ABTPDSFEKHIRSILQRSVAKPAFItKASEKD 
lAPPPEECLQLLSRATQVFREQYIUCQDrAKEEIQRRVKLLCDd' 
aOCQLEDXiSYCREERKSLREMAERIADKYEEAKEKQEDIMNRMK 








KLbHSFHSELPVLSDSERDM:<KELQLrPDQLRHI^NAIKQVTSiK~~ 

KDYQQQKMEKVLSLPKPTIIlUSAyQRKClQSILKBEGEHIREMV 

KQINDIRNHVNF 


5944


167 


3428 


FSIATFTDBPEVLTEPPSATTTTTIGISATWTTIAGSHGKjy^lt 
ITTTSSKRKNRKNKITPEWVQIIFDDPLPISYSQPEKVNGESKS 
SSTSESGDSDNMRrSSCSDESSNSWSSRKSDNHSPAWTT^VSS 
KKQPSVLWPPKEERKSVSGKASIKLSETISEGTSNSLSTCTKS 
GPS PbS SPNGKLTVASPKRGQKREEGWKEWRRSKKVSVPSTVX 
SRVIGRGGCNINAIREFTGAHIDIDKQKDKTGDRIITIRGGTBS 
TRQATQLINALIKDPDKEIDELIPKNRLKSSSANSKIGSSAPTT 
TAANTSLMGIKMTTVALSSTSQTATAI.TVPAISSASTHKTIKMP 
VN\NVRPGFPVSFP\LAypppQFAHAIiIiAAQTFQQIRPPRr,PMT 
HFGGTFPPAQSTWGPFPVRPLSPJyRATWSPKPHMVPRHSNQMSS 
GSQVNSAGSIiTSSPTTTTSSSASTVPGTSTNGSPSSPSVRRQLF 
VTWKTSNATTTTVTTTASNHNTAPTWATyPMPTAKEHYPVSS^ 
SSPSPPAQPGGVSRNSPLDCGSASPNKVASSSEQEAGSPPWET 
TNTRPPNSSSSSGSSSAHSNQQQPPGSVSQEPRPPLQQSQVPPP 
EVRMTVPPLATSSAPVAVPSTAPVTYPMPQTPMGCPQPTPKMET 
PAIRPPPHGTTAPHKNSASVQNSSVAVLSVNHIKRPHSVPSSVQ 
IiPSTIiSTQSACQNSVHPANKPIAPNFSAPLPFGPFSTI.FENSPT 
SAHAFVJGGSWSSQSTPESMLSGKSSYLPNSDPLHQSDTSKAPG 
FRPPLQRPAPSPSGIVNMDSPYGSVTPSSTHLGNFASNISGGQM 
YGPGAPLGGAPAAANFNRQHPSPLSLLTPCSSASNDSSAQSVSS 
tjVKAfi>FAfc'£>SVPLGSEKPSNVSQDRKVPVPIGTERSARIRQTG 
TSAPSVIGSNLSTSVGHSGIWSPEGIGGNQDKVDWCNPGMGNPM 
IHRPMSDPGVPSQHQAMERDSTGIVTPSGTFHQHVPAGYMDFPK 
VGGMPFSVYQNAMIPPVAPIPDGAGGPIFNGPHAADPSWNSLIK 
MVSSSTENNGPQTVWTGPWAPHMNTSVHMNQLG 


5945 


1461 


197 


GVTHLFXiFGKRKLRNGIAEDLKGQADFFFLLVSEAWATGSPRA 
WLTCIiILPIjPGIIFSVLPKAMSRPLLITFTPATDPSDIiWKDGQQ 
QPQPEKPESTUX3AAARAPYEALIGDESSAPDSQRSQTEPARER 

krkkrrimkapaaeavaegasgrhgqgrsleaedkmthrilraa 

V*MN3'*»'*J«r»4jx^rfciJijOJC'flIlAljL5rt*JvlN JLWAKDAFWWTPIjMCTU'UiAGQG 

AAVSYl^IXSRGAAWVGVCBLSGRDAAQIiAEEAGFPEVARMVRESH 
GETRSPENRSPTPSLQYCENCDTHFQDSNHRTSTAKIiLSLSQGP 
QPPNLPLGVPISSPGFKLLIiRGGWEPGMGLGPRGEGRANPIPTV 
LKRDQEGLGYRSAPQPRVTHFPAWDTRAVAGRE\TPPRVATLSW 
REERRREEXKDRAWBRDIiRTYMNIiEF 


5946 


541 


1666 


II.GSYSSIQi'lSEYS\SWC\EWLQDLLA\yVSPK\HSYI.RDIiP 

segflpqrvns idfv\el\ehlqpdvi.vi-iavlrwdp/ti lteav 
ysyrgqkqkkvmltvboaqejqhyalvlwgpgaawVypolqrkkg 

YIWEFKYLFVQCNYTLEMLELHTTPWSSCECLFDDDIRAITFKA 
KFQKSAPSFVKrSDLATHLEDKCSGWIjIKAQISELAPPITASQ 
KIALNAHSSIiKS IFSSLPNI VYTGCAKCGLEI.ETDENRl YKQCF 
SCLPFTMKKIYYRPALMTAIDGRHDVCIRVESKLIEKILLNISA 
DCWJRVIVPSSErTYGMWADLFHSLIAVSAEPCVLKlQSLEVL' 
DENSYPLQQDFSLLDFYPDIVKHGANARL 


5947 


3


1317 


RG I PDRRRRGP IGRVNMDliENKVKKMGLGHEQGFGAPCXKCKEK 

CEGFELHFWRKrCRNCXNVAKKSM/TVIiLSNEEDRKVGKLFaDX 

KYTTLr7VKLKSlX3IPMyKRNVMII,TNPVAAKKNVSIOT\^ 

PPVQNQALARQYMQMLPKEKQPVAGSEGAQYRKKQLAKQLPAHD 

QDPSKCHELSPREVKEMEQFVKKYKSEAIiGVGDVKLPCEMDAQG 

PKQMNIPGGDRSTPAAVGAMEDKSAEHKRTQYSCYCCKLSMKEG 

DPAIYAERAGYDKIiWHPACPVCSTCHELLVDMiyFWKNEKLYCG 

RHYCDSEKPRCAGCDELIFSNEYTQAENQNWHLKHFCCFDCDSI 

U«5E I YVrmiDKPVCKPCYVKNHAVVCQGCHNAIDPEVQRVTyN 

NFSWHASTECFLCSCCSKCLIGQKFMPVBGMVFCSVECKKRMS 


5948 


39 


3370 

1 


IfRERYPVSGGSVLRSALEVCWDFLSGLTEGSLLPEGFFSGPlDQ 
SNHYQMRRKGRCHRGSAARHPSSPCSVKHSPTRETLTYAQAQRM 
WSIEIEGRLHRIS IFDPLEI XLEDDLTAQEMSECa^SNKENSERt 
PVGLRTiOWIKNNRVKKKNEALPSAHGTPASASALPEPKVRIVEY 



WO 01/53312









5949 






39 



3370 



5950



"1166" 



373 



5951 






TOGFGGARSEQEPGGGLGRKATPRRRCASESSIsS^S 
FNA(.KOGRGKPAI.VHRHTLEDRSELISCIEMGtreA^i^^ 

PGHHNGVTlPAPPli,VI.KIGEHMOTKSDEKLFL^^SwQ 



GMyQMRRKGROlRGSAARHPSSPCSVKHSPTRETiTYAO^™ 
VElBlEGRLHRIsiroPLElILEDDLT^iSSi^ 

SPPSAPRRPPVYYKFIEKSAEELDNEVETOMDEBDY^LEITOE 



5449 



RMC«3SRARPADCVI,CPNKGG,oKKTDDDRWGHV\V^WUP 

E\VGPANTVPIEPIDGVRKIppARWKLT\a«,CKEKGR7vaici 

QCHKANCyTAPHVTCaQKAGI.YMKMEPVKBLTCGGTTPSvSrA 

yCDVHTPPGCTRRPI«l«3DVEMKNGVCRKESSVKrvST^ 

Kp™KALAEPCAVI.PTVCIAPYIPPQRIJmi^VAIQ^ 

FVERAHsyWLLKRLSRKQAPLLRRLQSSLQSQRSSQQREN^TS 

KAAKBKI,laWQRIJ«u>i,ERARLLrELI^liE^KR£Q^ 

MEUa,TPI,TVIj:«SVI.DQ«3DKDPARIFAQPVSI.KEVPDYLnHI 

KHPMDFATMRKRLBAQGrKMLHEFEBDPDLriDNCMKYNWv 

FVRAAVRI,RDQGGWLROARREVDSIGLEEASSMHLPERPAAAP 

RRPFSWEDVDRriDPANRflH^/5LEEQLRELLD^JLm^TCwSsSG 

SRSia«KLLKKEIAl,UajKLSQQHSQPLPTGpS^S^ 

GPEAGEEVLPRr^TiaK3PRKRSRSTa3DSEVEBESPGKRL^ 

TNGFGGARSEQEPGGGLGRKATPRRRCASESSISS^SPr^S 

FWAPKOSRGKPALVRRHTLEDRSELrSClENGireAKAARi^V 

GQSSMWISTDAAASVLEPLKVWJAKCSGYPSYPALIIDPKMPRV 

PGHHNGVTIPAPPLDVI,KrGEHM<3TKSDEKI.PI.VI,PPDK^«Q 



TSKa^-lTte;TSQP UACJPC-WAASRPAlLyAI.LaSSU(AVPR:>RSR ' 
CLCROHRPVOLO^PHHTCREALDVIAKTVAFIJUai^™^^ 

DQm,LQGCV!GPi:,FLUJIAQDAVTFEVAEAPVPSILKKIl!LES 
SSSGGSGOLPDRPQPSLAAVQWLQCCLESFWSLELSPKEXYACL 
fS^mn^!T°^*^''^^"^EA«WVLCEVLEPWCPAAQGR 
LTRVH.TaSTLKSIPTSLLGDLFFRPIIGDVDIAGLLOnMT.^ 



. ^w^^ccitf J. x»jU VUiAGLX/3DMLt.LR 

WNVKPSLLWQLFKFSDKBE HKgjJDSlSGKTGETGVEEMIATR ir 
\^QDSKETVKLSHEDDHILEDAGSSDISSDAACTWPNKTENSnV 
GLPSCVDEVTECNLEI.KDTMGIADKTENTr.ERNKIEPr^YCEDA 
ESNRQLESTEFWKSNLEWDTSTFGPESNILENAICDVPDONSK 
QIJJAIESTKIESHETANLQDDRNSQSSSVSyLESKSVKSKHTlCRi 
VlHSKQNMTTDAPKKIVAAKyEVIHSKTKVWVKSVmmVPE? 
QQNFHRPVKVRKKQIDKEPKlQSCNSGVKSVKrrQAHSVLK K^ 








dqtlvqlfkplthsl.sdkshahpgcjlkephhpaqtghvshseqk 
qchkpoqqapamktnshvkbelehpgvehfkeedklklkkpIkn 

LQPRQRRSSKSFSLDEPPLFIPDMIATIRREGSDHSSSFESKYM 

wtpskqcgfckkphgnrfmvgcgrcddwphgdcvglslsqaqqm 
geedkeyvcvkccaeedfckteildpdtiienqatvefhsgdktme 

CEKLGLSKHTTNDRTKYIDDTVKHKVKIIiKRESGEGRNSSDCRD 

ne i kicwqiapi,rkmgq?vlprrsseekseki pkesttvtctgek 
askpgthekqemkkkkvvekgvlnvhpaasaskpsadqirqsvr 
hslkdilmkrltdswlkvpeekaakvatkiekelfsffrdtdak 

YKNKYRSLMFNLIO^PraaNILPKKVLKGEVTPDHLIRMSPEELAS 

keuvawrrrenrhtiemiekeqreverrpitxithkgeieiesd 

APMKEQEAAMEIQEPAAKKSbEKPEGSEKVRKEEVDSMSKDTTS 

qhrqhlfdlnckxcigrmappvddlspkkvkvwgvarkhsdne 
aesiadat.sstsmiiiaseffeeekqespkst?spaprpempgtv 
evestflarujfiwkgfxnmpsvakfvtkaypvsgspeyiitedl 
pdsiqvggrispqtvwdyvekikasgtkeicwrftpvteedqi 
sytllfayfssrkrygvaannnkqvkdmyliplgatdkiphplv 

PFDGPGLELHRPNLLLGLIIRQKIiKRQHSACASTSHIAETPESA 

PPIALPPDKKSKIEVSTEEAPEEENDFPNSFTTVLHKQRNKPOQ 

NLQBDIiPTAVEPLMEVTKQEPPKPLRFliPGVLIGWENOPTTLEI, 

ANKPLP VDDII^SLLGTTGQVYDQ\AQS VMEQNTVKBI P FLNEQ 

TNSKIEKTDNVEVTDGENKEIKVKVDNISESTDKSAEIETSWG 

SSS XSAGSLTSIiS LRGKPPDVSTEAPLTNI^ IQSKQEETVESKE 

KTLKRQLQBDQENNLQDNQTSNSSPCRSNVGKGNIDGNVSCSEN 

LVANTARSPQFINLKRDPRQAAGRSQPVTTSESKDGDSCRNGEK 

HMLPGLSHNKEHLTEQINVEEKI.CSAEKNSCVQQSDNLKVAQNS 

PSVENlCyrSQAEQAKPI^EDILMQWIETVHPFRRGSAVATSHFE 

VGNTCPSEFPS KS ITFTSRSTSPRTSIWPSPMRPQQPNLQHUKS 

SPPGFPFPGPPNPPPQSMFGFPPHLPPPLLPPPGPGXFAVQNPM * * 

VPWPPW\HLP\GQPQRMMGPLSQASRYIGPQNFyQViCDZRRPE 

RRHSDPWGRQDQQQLDRPPNRGKGDRQRFYSDSHItDKRERHEKE 

WEQESERHRRRDRSQDKDRDRKSREEGHKDKERARLSHGDRGTD 

GKASRDSRNVDKKPDKPKSEDYEKDKEREKSKHREGBKDRDRYH 

KDRDHTDRTKSKR 


5952 


3226 


639 


PPARRSARDIiPRALSMEAARPSGSWNGALCRliVLVTIiXAFLIF 
ASDACKNVTLHVTSKIiDAEKLVGRVWIiKECFTAANLIHSSDPDF 
QILEDGSVYTIOTlLLSSEKIlSFTILIiSNTENQEKKKIFVPLEH 
C2TKVIjKKRHTKEKVI.RRAKRRWAP I pcsmlenslgpfplflqqv 
QSDTAQNYTIYYSIRGPGVDQEPRNLFYVERDTGNLYCTRPVDR 

YTFTX FENCRVGTTVGQVCATDKDEPDTMHTRLKYSI IGQVPPS 

PTLPSMHPrrGVITTTSSQLDRELIDKYQLKlKVQDMDGQYFGL 

QTTSTCI INIDDVNDHLPTFTRTS YVTS VBENTVDVEILRVTVB 

DKl^IjVin'ANWRANYTILKGKENGNFKTVTDAK'TWPrs^ 

NYEEKQQMILOIGVVNEAPFSREASPRSAMSTATVTVNVEDQDE 

GPECNPPIQTVRMKENAEVGTTSNGYKAYDPETRSSSGIRYKKL 

TDPTGWVTIDENTGSIKVFRSLDREAETIKNGIYXIXTVXASDQG 

GRTCTGTLGI ILQDVNDNS PF X PKKTVI ICKPTMSSAEIVAVDP 

DEPIHGPPFDFSLESSTSEVQRMWRLKAINDTAARLSYQKDPPF 

GSYWPITVRDRIKJMSSVTSLDVTLCDCXTENDCTHRVDPRIGG 

GGVQLGKWAIIiAILLGIALFFCIIjFTLVCGASGTSKQPKVIPDD 

IAQaNLIVSNTEAPGDDKVYSA3IGFTTQTVGASAQGVCGTVGSG 

IKNGGQETIEMVKGGHQTSESCRGAGHHHTIiDSCRGGHTEVDWC 

RYTYSEWHS FTQPRLGEES IRGHTL IKN 




5953


330 


811 


PLLCNPDPGWYWWVKQESEISKESQEMDARPKItDLGFKEGOTIK 
LCIGNITNKKGGASKPRTARGGGLSIiLPPPPGGKVTXPPPSS /V 
KLPSTNHVTPPS XPKSNHGGSDADm[*DX*DSPAPVTTPAPTPVS 
VSNDLWGDFSTASSSVPNQAPQPSNWVQF 




5954 


32 


2130 


PPPPPPKLANMADLEAVIiADVSYLMAMBKSKATPAARASKRIVXi 
PEPSIRSVMQKYLAERNEXTFDXIFWQKrGPLLFiCDFCLNEINf 
WPQVKPYEEIKEYEKLDNEEDRLCRSRQIYDAYlMKELt^CSH 



5955










1705 



5957



1479 



5958



139 



451 



ESDKFTRFCQWKNVELNIHLTMNEFSVHRIIGRGGFGEVYGdfeK 
VCMTYAFHTPDKLCFlLDLf^CKJDLHYHLSQHS 

CRDVSKRLGCHGGGSQEVKEHSFFKGVDWQHVYUJKVPPPLIPP 
RGEVNAADAFDrGSFDEBDTKGlKLUDCDQELyKNPPLVISERW 
QQEVTETVYEAVNADTDKIEARKRAIOJKQM^ 

ILSVEETQIKDKKCILPRIKGGKQFVLQCESDPEFVQMKKELNE 
TFKEAQRLLRRAPKFLtnCPRSGTVELPKPSLCHRMSWnT. 



PANRQDVI^GWIKIaPVLQLTKDPLKTPGRLDHGTRTAFIHHREQ 
VWKRClNIWRDVGIiPGVLNEIANSEBEVFEWXriCTASGWAIALCR 
WASSLHGSLFPHLSLRSKDLXAEPAQVTNWSSCCI^VPAWHPHT 
NKPAVAr^DSVRVYNASSTIVPSLKHRI^RNVA^WKP^^ 
VrAVACQSCIL-WrLDPTSLSTRPSSGCAivL^^TP^J^ 
^PSGGRI^SASPVDAAlRV»«>VSTBTCVPLPWFRGGGVim.LW 

RgQYRHQMVRRGLGERLT PWSGTPVGWVWLCI. 



GVGVRGAlMMATVQEKAjVA LNi^ALriSPAHkt>PGKt^VAQK]^^ 

IQim'FGDVDIPRAiCVVRVCQALNUJYKVFEAVPTKVBGKDKlCPT 
rfoffS^^^^^^^^^Q^^^^SPARYADALFKSSDrR 
SASLEDLWENLSLKPANSPHVNISATLSPQVXNEVWQEETIGRL 
^^®^^^^^^^^^^Q^^<^STMVNSSNYliDRGII,K 
AySDSQEDEWLSAArDCSEyi,PDQMW£ISRSFPEOPDRTDI.VK 
Erj^FDAIGRYYSSREPLLraLSDVHNGIAELLVNGKTEIALEAT 
QliLLKLl,DPQNREEPRRLLYFMAVAANPSEPKLQlCESONRMVVK 

Hjgjjg?J>EDSKLSAKEKKK\LLGQFYKCHPDIFlEHFGD 



EtgVAVAtTOlJ?RWKP KTiU<AKRFI.EKREFKI^IK^ 

SK^DCSl^FMFGSHNKKRPNNLVIGRMYDYHVLDMIEIiQIEiTPV 
SLKDIKWSKCPEGTKPMLIFAGDDFDVTEDYRRUCSLLIOFPRG 
PTVSNXRIJU3LEyVLHFTAr^GKIYFJ^YKrj:j[.KKSGCRTPRIE 
^[:?^v^^^^^™^°^^^^^MKMPKALKPKKiCKW 
3n?3r^^^^^^°^^"^^^\*^^^KKRPAERir3DHEKKS 
KRIKKKI^MBLSOPLLFHCVLIiKRIIKHOSIOSFn 



>*^m.LWFPAC0AKWXJ3VEK i.rvySGPXGSYFGrAVD^ 
ARTASVLVGAPKAWTSQPDIVEGGAVYYCPWPAEGSAQCRQIPP 
r^n^^^'^'^'^^^^^^^^Q"'^\ATVKA\HKGKSCGPVAP 
^i^S"^^^'^^^^^^^^*^^^QNFSAYAEFSPCGNSKADP 
EGQGYCQAGFSUDFYKKGDLIVGGPGSFYWQaQVITASV^ 
^f^"'^^°^^^^^^^^^°^YLGYSVAAGEFTGDSQQ 
EL^mGIPRGAQNFGYVS tINSYDMTFXQNFTGEQMASYFGYT\n^ 
VSDVNSDGLDDVLVGAPLFWEREFESNPREVGQIYLYLQVSSLL 
FRDPQILTGTETFGRFGSAMAHLGDLNQDGYNDIAIGVPFAGKD 
QRGKV^IYNGNKDGUmCPFPKFCQGVWASHAVPSGFGPTLRGD 
SDIDKNDYPDLri/GAPGTGKVAVYRARPVVTVDAQI,r,LKPMirw 
LENKTCQVPDSffTSAACFSLRVCASVTGQSrANTXVLMAEVQ^D 
SLKQKGAIKRTLFLDWHQAHRVFPLVtKRQKSHQCQDFIVYLRD 
EiEFRDKLSPUTISrJtJYSLDESTFKEGLEVKPILNYYRENIVSE 

qahilvdcgedni,cvpdijclsarpdkhoviig;>enhlmliikar 

NEGEGAYEAELFVMIPBEADYVGlERNNKGFRPLSCEYKMEtJVT 

Rmwcdi^npmvsgtnyslglrfavprlektkmsinfdlqirs/ 

NKDNPDSNFVSLQIKITAVAQVErRGVSHPPQIVI.PlHMWKPRPr 








EPHKEEEVGPLVEHIYELHNIGPSTISDTILEVGWPFSARDEpli 
LYIFHIQTLGPLQCQPNPNIKPQDIKPAASP2DTPEX,SAFLR»S 
TI PHLVRKRDVHWEFHRQSPAKI LMCTNI ECLQIS CAVGRLEG 
GBSAVLKVRSRLWAHTPLQRKNDPYALASLVSFEVKKMPVTDQP 
AKLPEGS I At KTS VI WATPNVS FS I PLWX IIAILLGLLVLAII, 
TLALWKCGFFDRARPPQEDMTDREQLTNDKTPEA 


5959


1 


1X66 


GTSGYAAQQLPSIjLKERJBFHLGTLNKVFASQHLNHRQVVCGTKC" 

ntiifwdvqtsqitkiprlkdrepggvtqqgcglhaiei*np5rt 
lilatggdnpnslaiyrlptldpvcvgddghkdwifslawisdtm 
avsgsrdgsmglwevtddvltksdarhnvsrvpvyahithkalk 
di pkedxnpdnckvraiiapnnknkelgavsldgyfhlwkaentl 
skllstklpycrenvciiaygsewsvyavgsqahvsfxidprqpsy 
nvksvcsrergsgirsvsfyehiitvgtgqgslupydiraqrft. 
eerlsacygskprlagenlklttgXkgwlnhdetwrnyfsdidf 
fpnavythcydssgtkiifvaggplpsglhgnyaglws 


5960 


2853 


870 


fvwsdggprprrgpavgagaaklsdpwamtpgtanratnplnke"" 

loijw/i^xwta t L-fcyuwtiUr JbOPPIiATRIJjAHKIQSPQEWEAIQALT 
VLETCMKSCGKRPHDEVGKPRFLNELIKWSPKYIiGSRTSEKVK 
NKILEIXYS WTVGLPEEVKIAEAYQMLKKQG\ XVKSDPKI*PDDT 
TFPLPPPRPKNVIFEDEEKSKMLARLliKSSHPEOLRAANKLIKE 
KVQEDQKRMEKISKRVNAIEEVNNNVKLI.TEMVMSHSQGGAAAG 
SSEDIj\MKEL\YQRCERMRPTLFPTGRVDTEDND\EAXiAErrjQA 
NDKLTQVINLYKQLVRGEEVNGDATAGSIPGSTSALLDIiSGLDL 
PPAGTTYPAMPTRPGEQASPEQPSASVSIiI£>DELMSLGI*SDPTP 
PSGPSLDGTGWNSFQSSDATEPPAPAIAQAPSMESRPPAQTSLP 
ASSGLDDLDLLGKTLLQQSLPPESQOVRWEKQQPrPRLTLRDLQ 
NKSSSCSSPSSSATSI.LHTVSPEPPRPPQQPVPTBI»SIASITVP 
LES I KPSNILPVTVYDQHGFRILPHFARDPLPGRSDVIiWWSM 
LSTAPQPIRNIVFQSAVPKVMKVKLQPPSGTELPAFNPIVHPSA ^ 

ITQVLIiLANPQKEKVRUlYKLTFTMGDQTYKEMGDVDQFPPPET 
WGSL 


5961 


198 


3147 


SGEPRPEPGNMATCIGEKIEDFKVGNLLGKGSFAGVYRAESIHT' 

GLEVAIKMIDKKAMYKAGMVQRVQNEVKIHCQLKHPSILELYNY 

PEDSNYVYLVLEMCHNGEMNRYLKNRVKPFSENEARHFMHQIIT 

GMLYLHSHGILHRDLTLSN3uLI*TRNMWIKIADPGI*ATQI,KMPHE 

KHYTLCGTPNyiSPEIATRSAHGLESDVWSl.GCMFyTLI.lGRPP 

FDTDWKNTrjtfiOTIiADYEfrfPa:TLSIEAKDLIHQr,LRRNPADRI. 

SLSSVLDHPFMSRNSSTKSKDLGTVEDSIDSGHATISTAITASS 

STSISGSLFDKRRLblGQPLPNKMTVFPKNKSSTDFSSSGDGNS 

FYTQWGNQBTSNSGRGRVIQDAEERPHSRYLRRAYSSDRSGTSN 

SQSQAKTYTMERCHSAE^^LSVSKRSGGGENEERYSPTDNNANIF 

KFFKEKTSSSSGSFERPDNNQALSNHLCPGKTPFPFADPTPQTE 

TVQQWFGNLQINAHLRKTTEYDSISPNllDFQGHPDLQKDTSiQIA 

WXDTKVKKNSDAS0NAHSVKQQNTMKYMTALHSKPEI1QQECVF 

GSDPLSEQSKTRGMSPPWGYQNRTr/RSITSPLVAHRLKPIRQKT 

KKAWS IliDSEEVCVELVKEYASQEYVKEVLQISSDGNTITIYY 

PNGG\RGFPIA\DRPPSPT\D^^:SR\YSP\DNl:JPEKrrfRKYQYA 

SRFVQLVRSKSPKITYFTRYAKCILMBNSPQADPEVWFYBGVKI 

HKrEDFIQVIEKTGKSYTl.KSESEVNSLKEEIKMYMDHAKBGHR 

ICLALES I rSEEERKTRSAPPFPIIIGRKPGSTSSPKALSPPPS 

VDSNYPTRDRASFNRMVMHSAASPTQAPIUfPSMVTNSGLGLVT 

TASGTDISSNSLKDCLPKSAQLIiKSVPVKNVGWATQ\LTSGAVW 

VQFNDGSQLWQAGVSSISYTSPNGQ\TTR\YGENEKIjPDYIKQ 

KLQCLSSILLMPSNPTPNFH 


5962 


20 


2447


RVCSSSASTASQAVMADAWEEIRRLAAtJFQRAQFAEATQRLSER 

ncieiwkliaqkqlewhtldgkeyitpaqiskemrdelhvrg 
grvni vdlqqvinvdlih lenrigdi iksekhvqlvlgqliden 

YuDRLAEEVNDKLQESGQVTISELCKTYDLPGNFLTQALTQRLG 
RI ISGHIDLDNRGVI FTEAFVARHKARIRGLFSAITRPTAVNSL . 
ISKYGFQEQULiYSVLEBLVKSGRIiRGTVVGGRQDKAVFVPDiydf 
RTQSTWVDSFFRQNGYLEFDAIiSRLGIPDAVSYIKKRYKTTQLI. 



5963



5964 






1130 



2147 



5965



"59^ 



102 



1498 




FLKAACVGQGLVDQVEASVEEAISSGT WVDIAPXJjPTSLSVEDA " 

AILIiQQVMRAFSKQASTWFSDTVWSEKFNiNDCTELFREfcMH 

QKAEKSMKNNPVHLITEEDLKQISTLBSVSTSKKDKKDERRRKA 

TEGSGSMRGGGGGNAKEYKlKKVKKKGRiCDDDSDDESQSSHTGK 

KKPEXSFMFQDEIEDPLRKHIQDAPEEFISEiAEyi.IKPI^KTY 

LEWRSVFMSSTTSASGTGRKRTIKDLQEEVSWUYNNIRLFEKG 

MKFFADDTQAALTKHIiLKSVCTDITNLIFNPLASDLMMAVDDPA 

AITSErRKKILSKLSEETKVALTEOOHNSLNEKSIEDFISCLDSA 

AEACDXMVKRGDKKRERQri.FQHRQAi:AEQLKVTEDPALlLHi:,T 

SVLLFQFSTHSMLHAPGRCVPQIIAFr^SKIPEDQHALLVKYQG 

LVVKQLVSQSKKTGQGDYPLNNELDKSQEDVASTTRKEIX)ELSS 

SI KDIjVLK3RK5SVTEE 

PWNPQDFPGNRGLMG\QKGEIGPPVGQQGKKGAPGMP\GLMGaef 
GSPGQPGTPGSKGSKGEPGIOGMPGASGUCGEPGAIBSPGEPGY 
MGX.PGIQGKKGDKaNQGEKGIQGQKGENGRQGIPGQQGlQGHHG 
AKGERGEKGEPGVRGAIGSKGESGVDGLMGPAGPKGQPGDPGPQ 
GPPGriDGKPGREFSEQFIRQVCTDVIRAQIiPVt.IiQSGRIRNCl»I 
CLSQHGSPGI PGPPGPIGPEGPRGLPGIiPGRDGVPGLVGVPGRP 

6vrglkglpgrngekgsqgfgypgeqgppgppgpegppgiskeg 
ppgdpglpgkdgdhgkpgiqgqpgppgicdpsi/:psviarrdpp 

RKGPNY 



SCRTRGRC>SPLQPRBAGSSRGSRARSEPeRPGGM EEACQVQTfK 
RGDPHELRNIFLQYASTEVDGERYMTPEDFVQRYLGLYKDPNSN 
PKIVQLLAGVADQTKDGLISYQEFLAFESVIiCAPDSMFIVAFQL 
FDKSGNGEVTFENVKEIFGQTIIHHHIPFNWDCEFIRLHFGHNR 
KKHLNyTEFTQFI.QELQr.EHARQAPALKDKSKSGMISGLDFSDI 
KVTlRSHMLTPFVEEMLVSAAGGSISHQVSFSYFNAPNSLliNNM 
ELVRKIYSTLAGTRKDAEVTKEEFAQSAIRYGOATPIiEIDII.yQ 
LADLYKASGRriTXiADIERIAPLAEGALPYNLAELQRQQSPGLGR 
PIWLQIAESAYRFTLGSVAGAVGATAVYPIDLVKTRMQNQRGSG 
SVVGELMYKNSFDCFKKVIiRYEGFFGLYRGLIPQLIGVAPEKAr 
KLTVKDFVRDKFTRRDGSVPLPAEVLAGGCAGGSQVIFTNPLEI 
VjaRLQVAGBITTGPRVSAIJIVLRDLGIFGLYKGAKACFLRDIP 
FSAIYFPVYAHGKLLLADENGHVGGI^NIJLAAGAMAGVVPAASl.^ 
TPADVIKTRLQVAARAGQTTYSGVXDCFRKILXREEGPSAPMKG 

taarvfrsspqfgWtlvtyellqrgfyidftgglkpagseptpk 
sriadlppakpdhiggyrriatatpagierkfglylpkfkspsva 
wqpkaavaatq 

MVTWIjYRFXiPrSNWAAKIiRSI.LPPPLR LQFWLH/VRLQKCFLSRG 
CGSYCAGAKASPLPGKMAMGLMOGRRELliRLLQSGRRVHSVAGP 
SQtfI>GKPLTTRLIiFPAAPCCCRPHYLPIJUlSGPRSI*STSAISPA 
EVQVQAPPWAATPSPTAVPEVASGETADWQTAAEQSFAELGL 
GSYTOVGLIQMI^EFMHVDLGLPWWGAIAACTVFARCLIPPLIV 
TGQREAARIIlNHLPEIQKFSSRIREAKIiAQDHlEYYKASSBMAL 
YQXKHGIKLYKPLILPVTQAPIFISFFIALREMANLPVPSLQTC 
GLWWPQDLTVSDPIYILPLAVTATMWAVIaELGAETGVQSSDIiQW 
MRNVIRMMPLrTLPITMHFPTAVFMYWLSSNLFSLVQVSCLRlP 
AVRTVLKI PQRWHDLDKLPPREGPLES FKKGWKNAEMTRQLRE 
REQRMRNQLELAARGPLRQTFTHNPLLQPGKDNPPNIPSSVSSS 
SSKPKSKYPWHDTIiG 

RSKQVMARLTKRRQADTKAIQHLWAAIEIIRNQKQIANXDRITK 
YMSRVHGMHPKETTRQLSLAVKDGLIVETLTVGCKGSKAGIEQR 
GYWLPGDEIDWETENHDWYCFECHLPGEVI.ICDLCFRVYHSKCL 
5DEFRLRDSSSPWQCPVCRSIKKKNTNKQEMGTYLRFIVSRMKE 

raidi,nkkgki)nkhpmyrrlvhsavdvptiqekvnegk:yrsyee 
fkmaqlli*hntvifygadseqadiarmlyfadtchel\delqlc 

KNCFYLANARPDNWFCYPCIPNHELDWAKMKGFGFWPAKVMQKE 

dnqvdvrffghhhqrawipseniqditvnihrlhvkrsmgwkka 

a)EI*ELHQRPLREGRFWK^KNEDRGEEE:ABSSISSTSNEQLKW 
QEPRAKKGRRNQSVEPKKEEPEPETEAVSSSQEIPTMPQPIElS^ 
SVSTQTKKLSAS5PRMLHRSTQTTNDGVCQSWCHDKYXKI FMDF 






5967



102 






5968



1925 



iojrmksdhkretkkvvrealeklrsemee eK^^^ 

EMDRKCKQVKEKCKEEPVEKTICVr.Ji'rr.xx.r^r;™"™ 



5969 



1X2 6"" 



316 



503 



4712 



BEEAMYHCCWNrSYCSIKCQ QEHHHAFiHtrPTPDovp " . ^ 
RSKQVMARLTKRRQADTKAIOmi WAAIEIIRtJUKOlANIDRITK 

GYWLPGDElDWETENHDWYCFECHLPGBVLtCDLCFRVYHSKCL 
SDEFRLRDSSSE>WQCPVCRSIKKKNTNKQEMGTYI>RF1VSRMKE 
^fn^f^^'^^^-'^^'^^^^V^'^^QEKVNEGKyRSYEB 
FKADAQLrjCiHNTVIFYGADSEQAOrARMLYKDTCHELXDELOLC 
KNCPYLANARPDNWFCYPCIPNHELDWAKMKGKSFWPAKVMQKE 
DNQVDVRFFGHHHQRAWIPSE^TIQDITVNZHRLHVKRS^rcWKKA 
CDEI,ELHQRFLREGRPWKSKNEDRGEEEAESSISSTSNEOLKVT 
QEPRAKKGRRWOSVEPKKEEPEPETEAVSSSQEIPIWPQPIEKV 
SVSTQTKKLSASSPRMUIRSTQTTNDGVCQSMCHDKYTKIFNDF 

^mksdhkrbtervvreai,eki>rsemeeekrSSvI^ 

EMDJUCCKQVKEKCKEEFVBEIKKZATQHKOLISQTKKKQW^ 

eeeamyhccwntsycsikcqqehwh aehkrtcrrkr 

VKyPRRGGAPPTVl.TPGRQQ <jyyi?U?PQRPGSBPDIPARGQPHPP " 



v^^^t^x^^uMir-i^ivjjTPGRQQGVFLGPQR PG^BPDIPARGQPHPP 
RPVGVSTSAQAQVQPPAMHRRRIArx;i^PCLIJ«3TSLSv£ 

LESmPTOLLTLTPWW^IVSBGTFNPELI^HIYQPr^TlGVW 
f««?^v"^^^^^^^^^^^I^E)NPAAVPGVPLGPHRL 
iffff/^°^""^^'^^"^^^^Q«^AKRAKREVDYLFCLDVD 
^^^^^^^''^'^^"^^^^V^^QQPPYE^^RVSTAFVA 
2!f??^^^*'^Q^^F^GCHMAILADKAMGIt«AWR 
^^^^"P'^SNKPSECVLSPBYtrfTODRKPQPPSLKI.IRFSTLDK 

T^GEPPS/TTSQKVKEAGRDFTYLIVVLroiSITGGLFYTI 
^^^^ff^^'^'^'^'^^^^^^^^^V^^^^SVKGYGEV^ 
G^QHVRFTEYVKDGliKHTCVKFnEGSEPGKQGTVYAQVKENP 
GSGEYDPRYIFVEIESYPRRrxIIEDKRSQDD 

RMEMh.UjyAEDATERRRVLEVEKEDTEELRQKYKDyVDKEKAIA 
KAI^EOLRANFYCELa&KQYQKHQBFramiMSYDHAHKQRI^L^ 
QREFARNVSSRSRKDEKKOEKALRRLHEIAEQRKQAECAPGSGP 
MFKPTTVAVDEEGGEDDKDESATNSQTGATASCGLGSEFSTDKG 
GPPTAVQITNTTGtAQAPGLASQGrSEXSIKNNLGTPLQIOiGVSF 
SFAKKAPVKLESIASVFKDHAEEGTSEDGTKPDEKSSDQGLQKV 
GDSDGSSNLDGKKEDEDPQDGGSLASTLSKLiCRMKREEGAGATE 
PEYYHYIPPAHCKVKPNFPPLLPMRASEQMDGDNTTHPKWAPES 
^^pf^^i^fiS^^^^^^'^^^^^^^^^^J^J^TSMTEPSEPGS 
KAEAKKALGGDVSDQSLESHSQKVSBTQMCESNSSKBTSLATPA 

GKES0EGPKHPTGPFFPVLSKDESTAr<3WPSELLIFTKAEPSIS 
YSCNPLYFDFKLSRNKDARTKGTF:TrPKT»Trteee«.r.rrr «^ — 



^..^r.cv^i.,txVRSSGGRMDAPASGSACSGia!IKQEPGGSHGSE 

TEDTGRSLPSKKERSGKSHRHKKKKKHKKSSKHKRKHKAI>TEEK 

SSKAESGEKSKKRKKRKRKKPJKSSAPADSBRGPKPEPPGSGSPA 

PPRRRRRAQDDSQRRSLPAEEGSSGKKDEGGGGSSSQDHGGRKH 

KGELPPSSCQRRAGTKRSSRSSHRSQPSSGDEDSDDASS^^ 

KSPSCYSEEEEEEDSGSEHSRSRSRSGRRHSSHRSSRRSYSSSS 

DASSDQSCYSRQRSYSDDSYSDYS0RSRRH5KRSHDSDDSDYAS 

SmSKRHKYSSSDDDYSLSCSQSRSRSRSHTRERSRSRGRSRS 

SSCSRSRSKRRSRSTTAHSWQRSRSYSRDRSRSTRSPSQRSGSR 

^WGHESPEERHSGRRDPIRSKIYRSQSPHYFUSGRGEGPGKK 

DDoRGDDSKATGPPSQNSNIGTGRGSEGDCSPBDKMSVTAKLDI, 

EKIQSRKVERKPSVSEEVQATPNKAGPKLKDPPQGYFGPKLPPS 

LGNKPVLPLIGKLPATRKPNKKCEESGLSRGEBQEQSETEEGP^ 

GSStlALPGHOFP\SEETTGPLI.DPPPEESKSGEVTADHPVAPLG 

PPAHFDCYUSDPTtSHNyLPDPSDGNTLESLDSSSQPGPVESSIi^ 

LPIAPDLEHFPSYAPPSGDPSIESYDGAEDAXSIiAPlaHSQPITP 








rPEEMEKYSKI;CX2AACX?HJQQQIJAKQVKAFPASAAIAPATPA-£r 

QPIHIQQPATASATSITTVQHAILQHHAAAAAAAIGIHPHE^PQ 

PIAQVHHrPQPHLTPISLSHLTHSIIPGHPATPLASHPIHIIPA 

SAIHPGPFTFHPVPHAALYPTLLAPRPAAAAATALHLHPLLHPI 
FSGQDLQHPPSHGT 


5971 


53 


2149 


SFLYFVGVuMDNPlGNWDGRFDGVQLCSFACVESTILIiHINDII^ 

PESVTQERRPPKLAFMSRGVGDKGSSSHNKPKATGSTSDPGNRN 

RSSLFYTLNGSSVDSQPQSKSKNTWYlDEVABDPAKSIiTEISTD 

FDRSSPPLQPPPVNSLTTENRFHSLPFSLTKMPNTWGSIGHSPL 

SLSAQSVMEELNTAPVQBSPPIiAMPPGWSHGLEVGSLAEVKENP 

PFVGVIRW IGQPPGLNEVLAGLEIjEDECAG\CTDGTP/REGTRY 

FTCALKKALFVKLKSCRPDSRFASLQPVSNQIEROISLAIWEAY 

lseweentptqkwekegleimig\kkkgiqghynscyldstlf 
clfafssvldtvllrpkekndveyysetqeiilrteivnpiiriyg 
yvcatkimklrkiiiekveaasgftseekdpeefiinilfhhilrv 

BPHiKIRSAGQKVQDCYFYQIFMEKNEKVGVPTIQQULEWSFIN 
SNLKPAEAPSCLIIQMPRFGKDFKLFKKIFPSLELNITDLLEDT 
PRQCRICGGLAMYECRECYDDPDJSAGKIKOFOKTrMTnuwT «d 

KRLWHKYUPVSIiPKDLPDWDWRHGCXPCQNMELFAVLCIETSHy 
VAFVKYGKDDSAWIiFFDSMADRDGGQNGFWIPQVTPCPEVGEYI. 
KMSLEDLHSLDSRRIQGCARRLLCDAIYVPCTQSPTMSI.YK 


5972 


440


1761 


ILLAGSPSPRDQCSQRQSSGGDKEIiVTRGCTFSTAWSPSAMTQ 

EPFREELAYDRMPTLERGRQDPASYAPDAKPSDLQltSKRLPPCF 

SHKTWVFSVLMGSCLLVTSGFSLYLGNVFPAEMDYLRCAAGSCI 

PSAIVSFTVSRRNANVlPNFQIliFVSTFAVTTTCLIWFGCKLVI. 

NPSAININFNLIbLLLLEI>Lr4AATVIlAARSSEEDCKKKICGSMS 

DSANILDEVPFPARVLKSYSWEVIAGISAVLGGI lALNVDDSV 

SGPHI>SVTFFMriLVACPPSAlASHVAAECPNKCLVBVI,IAISSI* 

TSPI.LFTASGYLSFSIMRIVEMFKDY?PAIKPSyDVLLLI,LLt.V ^ v 

WiLQA/GPQHGHRHPVRALQGQCKAAGCILGHPERPAGAPGWGG 

GQBPPEGVRQGESr*ESRRGANGPVTPRRGNRVAAPSLAPGMETH 
NP 


5973 


65 


2007


NGDGKDLFGHIWAWRSNGIISMPRRSPHAGMAEDEPDAKSPKTG 
GRAPPGGAEAGEPTTIJLQRLRGTISKAVCSJIKVEGILQDVQKPSD 
NDKLYI>YLQLPSGPTTGpKSSEPSTLSNEEYMyAyRWIRNHI.BB 
HIDTCLPKQSVYDAYRKYCESLACCRPLSTANFGKIIREIFPDI 
KARRI,GGRGQSKyCYSGIRRIcrLVSMPPLPGI.Dl4KGSESPEMGP 
EVTPAPRDELVEAAOybTCDWAERILKRSFSSIVEVARFttLQQH 
LISARSAHAHVLKAMGloAEEDEHAPRERSSKPKNGIiPJNPEGGAH 
KKPERIiAQPPKDLEARTGAGPtARGKRKKSWESSAPGANNLQV 
NAIiVARLPLLIiPRAPRSLl PP I PVSPPIIiAPRLSSGALKVATIiP 
LSSRAGAPPAAVPI XNMILPTVPALPGPGPGPGRAPPGGLTQPR 
GTEWREVGIGGDQGPHDKGVKRTAEVPVSBASGQAPPAKAAKQD 
lEDTASDAKRKRGRPLKKSGGSGERNSTPLKSAAAMESAQSSRIi 

pwetwgsggegnsaggaerpgpmgeaekgavlaqgXqgdgtvsk 

GGRGPGSQHTKEABDKIPt.VPSKVSVIKGSRSQKEAFPIAKGBV 
DTAPQGNKDLKEHVLQSSIjSQEHKDPKATPP 




5974 


4293 


2200 



lAjuyi'fitt 1 j.£>LrK±Hv/ii»fiv i':iJaWi!;iJNESVTVEWIENGDTKGK\EID 

lesifslnp\dl\vpdgeiepsp\etppppassakvnkivknrr 

TV\ASIKNDPPS\RDNRWGSARARPSQFPEQFSSAQQNGSV\S 
DISPVQAAKKEPGPPSRRKSNCVKEVEKLQEKREKRRLQQQELR 

ekraqdvdatnpnyeimcmirdfrgsldyrplttadpidehric 

VCVRKRPLNKKETQMKDU3VITI PSIODVVtWHEPKQKVDLTRYI, 

enqtfrfdyafddsapnemvyrftarplvetifergmatcfayg 
qtgsgkthtmggdfsgknqdcskgiyalaardvfmlkkpnykk 

LELQVYATPFEIYSGiCVFDIJJ^KTKLRVIiEDGKCSQVQVVGXiQE 

revkcvedvlklidigkscrtsgqtsanahssrshavfoiii^r 
kgklhgkfslxdlagnergadtssadrqtrlegaeinksiilaiik 
ecirat/srnkphtppraskltqviirdsfigensrtcmiatispfi " 
^centl^nryanrvksltvdptaagdvrpimhhppnolxdb 
[.etqwgvgss pqrddlklz,ceqneeevspqi,ftfheavsqmvem 








ESQWBDHRAVPQESIKWLEDEKALLEMTEEVDYDVbSYATQLE 
AILEQKIDIUTELRDKVKSFRAALQEEBQASKQINPKRPRAir 


5975


4293 


2200 


LGLQMHTTSGRIHQAMVTSIiHEDNESVTVEWIENGDTKGK\EID"" 
LESIFSLNP\DL\VPD3EIEPSP\ETPPPPAS<37iin/M»-Ti7»'MT>» 

TV\ASIKNDPPS\Rr)NRWGSARARPSQFPEQFSSAQQNGSV\S 
0ISPVQAAKKEFGPPSRRKSNCVKEVEKXQEKREKRELQQQELR 
EKRAQDVDATNPNYEIMCMIRDFRGSLDYRPLTTADPIDEHRIC 
VCVRKRPI^mOCBTQMKDriDVir I PSKDVVMVHEPKQKVDLTRYL 
ENQTFRFDYAFTJDSAPNEMVyRFTARPIiVETIFBRGMATCFAYG 
QTG3GKTHTMGGDFSGKNQDCSKGIYALAARDVFLMLKKPNYKK 
LEXiQVyATFFEIYSGiCVFni,T,NRKTKLRVLEDGKQQVQVVGLOE 
REVKCVEDVLIOilDIGNSCRTSGQTSANAHSSRSKAVFQIILRR 
KGKLHGKFSLIDLAGNSRGADTSSADRQTRLBGAElWKSIiIiAIiK 
ECIRALGRNKPHTPFRASKLTQVLRDSFIGENSRTCMIATISPG 
MASCENTI»NTIiRYANRVKELTVDPTAAGDVRPIMHHPPNQI\DD 
LETQWGVQSSPQRDDLKLLCEQNEEEVSPQIiFTFHEAVSQMVEM 
EEQVVEDHRAVFQBS1RWLEDBKALLEMTBB1/DYDVDSYA1X3LE 
AILEQKIDILTEIjRDKVKSFRAALQEEEQASKQINPKRPRAL 


5976 


20 


2949 


VHHbHLTRVSVVWLDIILRIAQQMGlKTI^VIX?\l,KRA\LEF 
PEVSWMEVKDPNMKGAr^LTNTGKYAiPTIDA\EAYAIGKKEKPP 
PLPEEPSSSSEEDDPIPDBLLCLICKDIMTDAWIPCCGNSYCD 
BCIRTALLESDEHTCPTCHQNDVSPDALIAilKFLRQAVNNFKNE 
TGYTKRUlKQt.PSPPPPIPPPRPLIQRNLQPLMRSPISRQQDPL 
MIPVTSSSTHPAPSISSLTSNQSSIAPPVSGNPSSAPAPVPDIT 
ATVS ISVHSBKSDGPFRDSDNKILPAAAIiASEHSKGTSS lAITA 
LMEEKGYQVPVLGTPSLLGQSLXiHGQLXPTTGPVRrNTARPGGG 

RPGWEHSNKLGYIjVS P POOTREfiPP QPTTO Q TKn>r»DUTJC?-Ctrt Pr\n*n 

CGPSLPATPVFVPVPPPPXiYPPPPHTLPLPPGVPPPQPSPQFPP 
GQP\PPAGYSVPPPGFPPAPAKLSTPWVSSGVQTAHSNTIPTTQ ^ 
APPLSREEFYREQRRLKEEEKKKSKLDBFTNDFAKELMEYKfaQ 

kerrrsfsrskspysgssysrssytysksrsgstrsrsysrsfs 

RSHSRS YSRS PPy PRRGRGKSRNYRSRSRSHGYHRSRSRS PPYR 

ryhsrsrspqafrgqspnkrnvpqgetereyfnryrevpppydm 
kayygrsvdfrdpfekeryrewerkyrewyekyykgyaagaqpr 
psamrenfsperfr.plnirmspftrgrrbdyvggqshrsrnigs 

kgeesegplkpelletsrksreptgvebmktdsijfvripsrddat 

PVRDEPMDAES ITFKSVSEKDKRERDKPKAKGDKTKRKNDGSAV 
SKKBNIVKPAKGPQEKVrK3\DVia)X»LDLNI,\QI*KXPKEETFia^ 
TILNHHLPLRRMKKSLS EPP\ BKLTIiWQQKXTPRNKTSQRGKSE 
EGLFQRCQIRKANN 


5977


1363 


1336 


FLEDRGQVJ^jSHFQCLSLHS inhi lhpgagvaagpaTgw/reylt 
pvlkeskfketgvitpeefvaagdhlvhhcptwqwatgeelkvk 
aylptgkqplvtkwvpcykrckqmeysdeleaiieeddgdggwv 
dtyhntgiltsiteavkeltlenkdnirlqrcsalcebeededeg 

EAADMEEYBESGLLETDBATLDTRKIVEACKAKTDAGGEDAlIiQ 

trtydlyitydkyyqtprlwlfgydeqrqpltvehmyedisqdh 
vkktvtienhphlppppmcsvhpc3u1aevmkki letvaegggel 

GVHMYIiLIFLKFVQAVIPTIEYDYTRHFTM 


5978


160 


3213 


RDGARRWGGCQSPLTWAPGFYRRFDLATSGRRLRGQTAEPAGRQ 
RPRREPEAKDEQSVESIAEVFRCFICMEKIiRDARLCPHCSKLCC 
FSCIRRWLTEQRAQCPHCRAPLQLRELVNCRWAEEVTQQLDTI^ 
LCSi:,TKHEBNEKDKCBNHHBKLSVFCWTCKKCICHQCAl,WGGMK 
GGHTFKPIiAEI YEQHVTKVNEEVAKLRRRLMELISLVQE VERNV 
EAVRHAKDERVRE 1 RNAVEMM I ARIiDTQLKNKLITIiMGQKTSLT 
QETELLESLLQEVEHQLRSCSKSBLISKSSEILMMFQQVHRKPM 
ASFVTTPVPPDFTS ELVPSYDSATFVLENFSrLRQRADP VYSPP 
LQVSGLCWRLKWPDGNGWRG YYLS VFLELSAGLPETS KYEYR 
VEMVHQSCNDPTKKI IREFASDFEVGECWGYNRFFRLDLIANBG 
yLN^ONDTVIIjRPQVRSPTFFQKSRDQHWYITQLEAAQTSYIQfe 
INNliKERLTrEIiSRTQKSRDLSPPDNHLSPQKDDALETRAKKSA 



5979 



212 



3665 



5980 



23^3 



5981 



2S19 




DLDLVYEDEVNQLDOSSSSASSTATSNTEENDIDEEXMSGENBV 

EyWWMELEEGELMEDAAAAGPAGSSHGYVGSSSRISRRTHLCSA 

ATSSLLDIDPLILIHLLDLKDKSSrENLWGLQPRPPASliLQPTA 

SYSRia)KDQRKQQiF^^RVPSDr,KMLKiaJKTQMAE\mCMKTDV^ 

TLSEIKSSSAASGDMQTSLFSADQAALAACGTENSGRLODLGME 

LLAKSSVANCYIRNSTNKKSNSPKPARSSVAGSLSLR^VDPGE 

NSRSKGDCQTX^SEGSPGSSQSGSRHSSPRALIHGSIGDILPKTB 

DRQCKALDSDAVWAVPSGLPAVEKRRKM\nfI/5ANAKGGHI.EGL 

QMTDLEtWSETGELQPVLPEGASAAPEEGHSSDSDIEODTENEE 

QEEHTSVGGFHDSFMVMTQPPDEDTHSSFPDGEQIGPBDLSFWr 
DEMSGR 



L.PDMTMyi.Wi.y^lAFGFAFLDTE VFVTGQSPTt>SPTDAYLNASE 
TTTLSPSGSAVISTTTIATTPSKPTCDEKYANITVDYL^CETK 
LFTAKL^n/NENVEOGNNTCTNNEVHNLTECiCNASVSISHNSCTA 
PDKTLILDVPPGVEKVPVHCCSVQVEQPDSTIWLKMKNIETSTC 
DTQNITyRPQCGNMIPDNKEIKr,BNUSPEHEYKCDSElLYNSHK 

FTNAS KI I KTDPGSPGEPO T T PrT> oh** nu^t r-r 



* ^*^Ai>i^^u^ijL»iuj4L,iKyDIiQNLKPYTKYVLSX«AyitA 
KVQRKGSAAMCHFTTKSAPPSQVWNMTVSMTSDNSMHVKCRPPR 
DRNGPHERYHLEVEAGNTLVRNESHKNCDFRVKDLQYSTDYTFK 
AYFHNGDYPGEPFIIJIHSTSYNSKALIAFLAPLIIVTSIALLW 
LYKIYDIJJKKRSCmJJEQQELVERDDEKQI^INVEPIHADILLET 
YKRKIADEGRLPIAEPQSIPRVFSKFPIKEARKPPNQNKNRYVD 
ILPYBYNRVEaCSErNGDAGSNyiNASYIDOFKBPRKYlAAOGPR 
I>ETVDD5WRM1WEQKATVIV^^VTI^CEEGNRNKCABYWPSMEEGT 
RAFGECCCKDLTKHKRCPNDYIIQIONIVNKKEKATGREVTHIQ 
FTSWPDHGVPEDPHIiLLKLRRRVNAFSNFFSGPIWHCSAGVGR 



«*.*s<^ ^^x^^^c^iL X vwi^a hiLiHP YXiHNMKKRDPPSEPS PLEAE 

FQRLPSYRSWRTQHIGNQE\ENKSKNRWSNVIPYDYNRVPLKHE 

LEMSKESEHDSDESSDDDSDSEEPSKYIWASFIMSyWKP\EV^fI 

AAQGPLKETIGDPWQMIPQRKVKVIVMLTELKHGDQEXCAQYWG 
EGKQTYGDIEVDLKDTDKSSTVTT.to'UTrTrT.-Dirc vt> vr^c.«^ 



. •'-""-^'^"*VvvxvuftJjJ/U^J^*J«>KWHiQJHKSTPIi 

LIHCRUGSQQTGIFCALLNLLESAETEEWDIPQWKALRKARP 

GMVSTFEQYQFLYDVIASTYPAQNGQVKKNNHQEDKIEFDKEVD 

KVKQDANCVNPLGAPEKLPEAXEQAEGSEPTSGTEGPEKSVNGP 
ASPAXtNQQS 

l^AWGCKLRRl,RFTyGTQTRVSIALPGQYEI.VHTLVAHQGNWBTI 

PEEDLEVQENNEDAAHDLTELEVTMHHAIiLQEVDVVVAPCXJGI,R 

PTVDVLGDLVNDFLPVITYALHKDELSERDEQELQEIRKYFSFP 

VFFFKVPKLGSEIIDSSTRRMESERSPLYRQIiIDLGYLSSSHWW 

CX3APGQDTKAQSMIiVEQSEKI,RHLSTFSHQVLQTRLVDAAKAI^ 

LVHCHCLDI FIKQAFDMQRDUJITPKRr.EYTRKKENELYESLMN 

lANRKQEEMICDMIVETLNTMKEELLDDATNMEFKDVIVPENGEP 

VGTREIKCCIRQIQELIISRLNQAVANKLISSVDYLRESFVGTL 

ERCLQSI,EKSQDVSVHITSNYLKQItJiAAyHVEVTFHSGSSVTR 

MI.WEQIKQIIQRITWVSPPAITLEWKRKVAQEAIESr.SASKLAK 

SICSQFRTRI^SSHEAFAASLRQLEAGHSGRLEKTEDLWLRVRK 

DHAPRLARLSLESRSLQDVLLHRKPKIiGQELGRGQYGWYLCDN 

WGGHFPCAI.KSVVPPDEKHWNDLALEFHYMR$LPKHERLVDLKG 

SVIDYNYGGGSSIAVLLIMERLHRDLyTGLKAGLTIiETRLQIAL 

DWEGIRFLHSQGLVHRDIKLKNVLIiDKQNRAKlTDliGPCKPEA 

MMSGSIVGTPIHMAPELPTGKYDNSVDVyAPQILFWyiCSGSVK 

LPEAFERCASKDHLVINNVRRGARPERLPVFDEECWQIJ4EACWDG 

DPLKRPI>I>GIVQPWt,QGIMNRLCKS\NSEQPWRGL DDST 

GRRHSAAMERPWGAADGLSRWPHGLGLiLLLQLLPPSTLSQDRI, " 

DAPPPPAAPLPRWSGP IGVSWGLRAAAAXGGAFPRGGRWRRSAP 

G\EDEECX5RVRDFVAiaJVNNTHQHVFDDr>RGSVSI^WVGDS'K3V| 

ILVLTTFHVPLVIMTFGQSKLYRSBDYGKNFKDITDLIMNTFIR 




5982 


5^ 


VCriAKWGSDNTIPFTTyANGSCKADLGALBI.KRTSDIK;KSFKTI 
GVKIYSFGIiGGRFLFASVMADKDTTRRIHVSTDQGDTWSMAQLP 
SVGQEQFYSILAANDDMVFMHVDEPGDTGraTIFTSDDRGIVYS 
KSLDRHLYTTTGGErDPTNVTSLRGVYITSVLSEDNSlOTMITF 
DQGGRWTHLRKPENSECnATAKNKNECSLHIHASYSISQKbNVP 
MAPLSEPNAVGIVIAHGSVGDAXSVMVPDVYISDDGGYSWTKML 
EGPHYYTILDSGGirVAIEHSSRPINVIKFSTDEGQCWQrYTFT 
RDPIYFTOIASEPGARSMWISIWGFTBSPLTSQWVSYTIDFKDI 
LERNCBEKDYTIWLAHSTDPEDYEDGCILGyKEQFJ.RLRKSSVC 
QNGRDYWTKQPS ICLCSLEDFLCDFG YYRPSNDSKCVEQPELK 
GHDLEFCLYGREEHLTTNGYRKIPGDKCQGGVNPVREVKDLKKK 
CTSNFLSPEKQNSKSNSVPIILAIVGLMLVTVVAGVLIVKKYVC 

ggrflvhlysvlqqhXaeaXngvdgvdaldtashtnksgyhdds 

1 ^EDXjLE 




5983 




■^^^*> ATRppRGSSWCRQFSRTASAAPGRSNMLRIPVRKALVGLSKSPk- 

gcvrttataasnlievfvdgqsvmvepgttvlqacekvgmoipr 

FCYHERI,SVAGWCRMCLVEIEKAPKWAACAMPVMKGWMII.TNS 

EKSKKAREGVMEFt.IJUmPLDCPrciX2GGEC33LQDQSMMFGNDR 

x^^2^2^^^^^^^^^^"^^CIQCTJ^CrRFASElAGVDD 

LGTTGRGNDMQVGTYIEKMFMSELSGNIIDICPVGALTSKPYAF 

TARPWETRKTESIDVM0AVGSN1WSTRTGEVMRILPRMHEDIN 

EEWISDKTRFAYDGX.KRQRLTBPMVRNEKGLLTYXSWEDAIiSRV 

^^^^rS^^^^^^^^^^VALKDLLNRVDSDTLC^ 

VPPTAGAGTDLRSNyi.LNTTXAGVEEADWLLVGTNPRKEAPLF 

NARIRKSWLHNDLKVALrGSPVDLTYTYDHLGDSPKlLQDIASG 

SHPFSQVLKEAXKPMWIiGSSALQRNDGAAIIAAVSSlAQKIRM 

J^fSyj^^S^'^^^^^Q^^^^^P^^II^I^PPKVLF 

pi^^^i^''^°^^^'^^^^^^««^^V°^'^^ADVILPGAAYT 

S^^r^^^^^^'^^^^^^^EDWKIIRAnSEIAGMTL 

PyDTL\DQVRWRLEEVSPNLVRYDDIEG\ANyFQQANELSKI,VN 

^EPSIC^^^^^^™^^^^^^^^^^'^^*^^^^^^^^ 




5984 


248 


1763 EARGDGGRRRHRASGRRAGRGBP\AGLKSQGQRAVPKRAVARGG 
RQXYSAAIALLEPAGSBIADDLSILYSNRAACYLKEGNCSGCIO 
iJvavKAii^ioHt^FSMKPLLRRAMAYETLEQYGKAYVDYKTVLQ 
?i'SJ??^^^^^^^^^^^^^'^EKLS^I^AVPASVPLQAWH 
PAKEMlSKQAGDSSSHRQQGrTDEKTPKALKBBOMQCVNDKKYK 
DALSKYSECLKINNKECAIYTMRALCYLKLajFEEAKODCDQAL 
S^^^^^^^^^^^^^'^^^Q^^IJ^^I^II'I'DPSII^^ 
MELEEVTRLLMLKDKTAPFWKBKBRRKIEIQEVNBGKEEPGRPA 
GBVSTGCLASEKGGKSSRSPEDPEKLPIAKPNNAYEFGQIINAL 
STRKDKEACAHLLAITAPKDLPMFLSNKLEGDTFLLLIQSLKNN 

SDTPNNHFTLEDIQALKRQYEL 


5985 


Vb5 


1193 ^^^^^^"^i^itl^^QRSVSFU^GLMRVSTG^EhRLmSPVl.'- 
TGDVGRRICRLLVGLFTKGDTSSKKVHPFSPGPCFLLCDIARVG 
SSPKINVSPFYQN\QTSTQRSCTVFVWQRCSLVGPPQVrVFTMY 




22 


1408 


RRPNPSIPSAAAGMSHIQIPPGLTELLQGYTVEVLRQQPPDI,VE 
FAVEYFTRLREARAPASVLPAATPRQSLGHPPPEPGPDRVADAK 

HPKTDEQRCRLQEACia5XDLFKNLDQEQLSQVCnAMFERIVKAD 
EHVIDQGDDGDNFYVIERGTYDILVTKDNQTRSVGQYDNRGSFG 
1 ^!:^3^^-^TIVATSEGSLWGr.DRVTFRRIIVKNWAK^ 
5f?^^f5^^^^^^^^^^^^^^^^°SKIYKR/DGERllTQGE 
K\ADSFYII2SGEVSILIRSRTKSNKDGGMQEVEIARCMKaQYF 
GELALVTWKPRAASAYAVGDVKCLVMDVQAFERIiGPCMDIMKrf 
NISHYEKQLVKMFGSSVDLGNLGQ "i^^MK^ 


5986 


1806 


484 


' DAWKSTSLTFHWKLWGRHRGRRRGIAHPKNHLSPQQGGATPQVP"" 
SPCCRFDSPRGPPPPRLGLLGALMAEDGVRGSPPVPSGPPMfiED 
GLRWTPKSPLDPDSGLLSCTLPKGFGGQSGPEGERSIAPPDASI 
LISNVCSIGDHVAQEI.FQGSDLGMAEEAERPGEK\AGQHSPi:jRE 
EHVTCVOSIL.DEFLQT\YGSLIPLSTDEWEKLEDIPQQEPSTP 
SRKGLVLQLlQSYQRMPGNAMVRGFRVAYKRHVLTMDDIiGTLYG 
QNWL>JiXJVMNMYGDLVMDTVPEK\VHFFNSFFY\DKIiRTKGYDG 
VKRWTKNVD I FNKELUil PIHLEVHWSLISVDVRRRTITYFDSQ 
RTLNRRCPKHIAKYLQAEAVKKDRI£>FHQGWKGYFKMWVARQNN 

DSDCGAFVLQYCKHIiALSQPFSFTQQDMPKLRRQIYKEI^CHCKL 
TV 


5987 


1806 


484 


"DAWKSTSLTFHWKLWGRHRGRRRGLAHPKNHI*SPQQGGATPQVP ' 
SPCCRFDSPRGPPPPRLGLLGAI.WAEDGVRGSPPVPSGPPMEED 
GLRWTPKSPLDPDSGLLS CTLPNGFGGQSGPEGERSLAPPDAS I 

EHVTCVQSlIiDEFLQTNYGSLIPLSTDBWEKLEDIFQQEFSTP 

SRKGLVLQLIQSYQRMPGNAMVRGFRVAYKRHVLTMDDLGTLYG 

QNWIiNDQVMNMYGDLVMDTVPEK\VHFFNSFFY\DKIiRTKGYDG 

VKRVmCWDIPNKEt^LLIPIHLEVHWSLlSVDVRRRTITYFDSQ 

RTLNRRCPKHIAKYI^AEAVKKDRLDFHQGWKGYFKMNVARQNK 

DSDCXSAFVLQYCKHXiALSQPFSFO'QQDMPKLRRQIYKBLCaiCKL 
TV 


5988 


1292 


410 


FKKYPLSFI^GXiIiESSHSRDRIHNLVLMFtilATHNLVWWFTCRFQ 
RLDCIYLNAGIMPNPQLNIKAtiLFGLFS\AEGI»LTQGDKlTADG 
LQEVFEXDVFGHFILIRELEPLLCHSDNPSQLIWTSSRNARKSN 
FSI1EDFQHSKGKEPYSSSKYATDLLSVALNRNPMQQGI.YSNVAC 
PGTALTNLTYGILPPFIWTI*LMPAII*LLRFFANAFTIiTPYNGTE 
ALVWIiFHQKPESLNPLIKYLSATTGFGRNYXMTQKMDLOEDTAE 
KFYQKLLELEKHIRVTIQKTDNQARtiSGSCIi * ^ 


5989 


194 


2610 


AMDFPQHSQHVLEQLNQQRQLGliLCDCT^VVDGVHFKAHKAVLA 
ACSEYFKMtiFVDQKDVVHLDISNAAGLGQVLEFMYTAKLSLSPE 
NVDDVL\AVATFLQMQDIITACHALKSLAEPATSPGGNAEAIAT 
EGGDKRAKEBKVATSTLSRLEOAGRSTPIGPSRDLKEERGGQAQ 
SAASGAEQTEKADAPREPPPVELKPDPTSGMATVABAEAAIiSESS 
EQEMEVEPARKGEEEOKEQEEQEEEGAGPAEVKEEOSQLENGEA 
PEENENEESAGTDSGQELGSEARGLRSGTYQDRTESKAYGSVIH 
KCEDCGKEFTHTGNFKRHIRiHTGEKPFSCRECSKAFSDPAACK 
AHEKTHSPLKPYQCEEOGKSYRLISLIjm*RKKRHSGEARYRCED 
CGKLFTTSGNLKRHQLVHSGEKPYQCDYCX5RSFSDPTSKMRHLE 
THDTDKEHKCE>HCDKKFNQVGWLKAHIiKIHIADGPI*KCRECOKQ 
FTTSGNIjKRHLRIHSGEKPYVCIHCQRQFADPGALQRHVRIHTG 
EKPCQCVMCGKAFTQASSLIAHVRQHTGBKPYVCERCGKRFVQS 
SQIiANHIRHHDNIRPHKCSVCSKAPVNVGDLSKHIIIHTGEKPY 
LCDKOGRGPNRVDNLRSIIVKTVHQGKAGIKILEPEEGSEVSWT 
VDDMVTLATEAIiAATAVTQLTVVPV'GAAVTADETEVIiKAEISKA 
VKQVQEEDPNTHII»YAC35SCGDKFLDANSLAQHVRIHTAQAI*VM 
FQTDADFYQQYGPGGTWPAGQVLQAGELVFRPRDGAEGQPALAE 
A 0 r 1 Aflit-PFPAE 




5990 


2 


4700 


FGPGPDSGGGARGSGWGSRSQAPYGTLGAVSGGEQVLLHEEAGD 
SGFVSLSRLGPS LRDKDLEMEELMLQDETLLGTMQS YMDASLI S 
LIEDFGSJjGEVEMSLPDPSWDFSPPSFLETSSPKIjPSWRPPRSR 
PRWGQ3 PP PQQRSDGEEEEEVAS FSGQILAGELDNCVSS I PDPP 

mhlacpeeedkataaemavpaagdes isslselvramhpycxipn 
lthlasledelqeqpddltlpegcwleivgqaatagddleipv 
vvrqvspgprpvllddsletssalqllmptleseteaavpkvtl 
csekeglslnseeklbsacllkprewepwpkepqwppanaap 
gsqrarkgrkkkskeqpaacvegyarrlrsssrgqstvgtevts 

QVDNLQKQPQEELQKESGFIiQGKGKPRAWARAWAAALENSSPKN 
LERSAGQSS PAKEGPI/DLYPKLADTIQTNP IPTHLStiVDSAQAS 
PMPVDSVEADPTAVGPVLAGPVPVDPGLVDrASTSSEriVEPIjA 
EPVJuINPVIiADSAAVDPAVVPISDNLPPVDAVPSGPAPVDLALV 








DPVPNDLTPVDPVLVKSRPTDPRRGAVSSALGGSAPOIiLVESES"" 

LDPPKTIIPEVKEWDSLKIESGTSArXHEARPRPLSLSEYlJRR 

RQQRQAETEEKSPQPPI'GKWPSLPETPTGLADIPCLVIPPAPAK 

KTALQRS PETPLEI CLVP VGPS PAS PSPEPPVS KPVASS PTEQV 

PSQEMPLLARPSPPVQSVSPAVPTPPSMSAALPFPAGGLGMPPS 

LPPP?LQPPSI.PLSMGPVLPDPFTHYAPLPSWPCyPHVSPS0YP 

CLPPPPTVPLVSGTPGAYAVPPTCSVPWAPPPAPVSPYSSTCTY 

GPLGWGPGPQHAPFWSTVPPPPLPPASIGRAVPQPKMESRGTPA 

GPPENVLPIiSMAPPLSLGLPGHGAPQTEPTKVEVKPVPASPHPK 

KKVSALVQSPQMKALACVSAEGVTVEEPASERLKPETQETRPRE 

KPPLPATKAVPTPRQSTVPKLPAVHPARLRKLSFLPTPRTQGSB 

DWQAFISEIGIEASDLSSLI^QFEKSEAKKECPPPAPADSLAV 

GNSGGVDIPQEKRPLDRLQAPELANVAGLTPPATPPHQI.WKPLA 

AVSLIAKAKSPKSTAQEGTLKPBGVT2AK1IPAAVRLQEGVHGPS 

RVHVG3GDHDYC\VRSRTPPKK\MPALIiIPEVGSRWNVKRHQDI 

TIKPVLSLGPAAPPPPCIAASRSPLDHRTSSEQADPSAPCIiAPS 

SLLSPEASPCRNDMNTRTPPEPSAKQRSMRCYRKACRSASPSSQ 

GWQGRHGRNSRSVSSGSNRTSEASSSSSSSSSSSRSRSRSLSPP 

HKRWRRSSCSSSGRSRRCSSSSSSSSSSSSSSSSSSSSRSRSRS 

PS PRRRSDURRRY SS YRSHDHYQRQRVLQKERAXEERRWFIGK 

IPGRMTRSELKQRPSVFGBIEECTIHFRVQGDNYGFVTYRYAEE 

AFAAIESGHKLRQADEQPFDbCPGGRRQFCKRSYSDUDSNREDP 

DPAPVKSKPDSLDFDTUiKQAQKNIiRR 


5991 


334 


1379 


RLSSHFSQCSPSIYCNTKFDKQGNVTSFERKKTELYQEbGIiQAR"" 
DLRFQHVMS I TVRNNRI IMRMEYIiKAVITPECI*I*ir»DYRNIiHLK 
0WLFR3LPSQI*SGEGQLVTYPI,PFEFRAIEAr»LQYWINTI*CX3KI» 
SILQPI*ILETl4DALGDPKHSSVDRSiCI*HILLQNGKSLSELETDl 
. KI FKES II*E ILDEEEJCLEEIiCVS KWSDPQVPEKSSAGIDHAEEM 
ELriLENYYRLADPLSNAARELRVLIDDSQSIIFrNLDSHRNVMM ^ 
RMJLOiiTMQTFSLSLFGLMGVAFGMNLESSLEEDHRIFWIiITGI 
MFKGSGliIWRRLLSFIAjR/LARSSIASYGMKDMVHGGlVEGti 


5992 


2 


609 


AGPDFRLVCGVSGSGFPGGRQGQATEWRPL.RPWNGAMBICLRRVX^ 
SGQD0EEQGr.TAQDSQINI,/SHVLDASSLSPKTRl.KWFAICFVC 
GVFFS ILGTGLLWLPGG I KLFAVFYTLGNLAALASTCPLMGPVK 
QLKKMFEATRI.IATIVMLLCFIFTLCAALWWHKKGriAVLFCIIiQ 
FLSMTWYSliSYIPYARDAVIKCCSSIiLS 


5993 


1650 


594 


AEGIiGSWAVWAGLGWAGRHMEAGGATGAIiGVGCKLPSAFCFPGS 
SVAMDMFQKVEKIGEGTYGWYKAKNRHTGQLVAIiKKlRLDt^ 
EGVPSTAIRE I SLLKELKHPNIVRLLDWHNERKLYLVFEFliSQ 
Di:.KKyi4DSTPGSELPi:iHLIKSYIiFQLI,QGVSFCHSHRVIHRDIiK 
PQNLLINELGAlKliADFGLARAFGVPLRTYTHEWTLWyRAPEI 
LIATRFYTTAVDIWSIGCI FAEMVTRKAI»FPGDS\EIDQ\LPRX 
FRNLGTPSEDTWPGVTQLPDYKGS FPKWTRKGIjEEI VPNLE PEG 
RDLLMQLriQYDPSQRITAKTALAHPYFSSPEPSPAARQYVLQRF 
RH 


5994 


394 


1934 


AGKVQLHVWIRGMRIQPQ/KAAAIIDIjDPDFEPOSRPRSCTWPL 
PRPEIANQPSKPPEVEPDLGEKVHTEGRSEPILLPSRLPEPAGG 
PQPG I liGAVTG PRKGGS RRNAWGKQS YAEIiISQAI ESAPE KRIiT 
liAQIYEWMVRTVPYFKDKGDSNSSAGWKNS IRHNI*SI*HSKFIKV 
HNBATGKSSWWMI^NPEGGKSGKAPRRRAASMDSSSKLLRGRSKA 
PKKKPSGLPAPPEGATPTSPVGHFAKWSGSPCSRNREEADMWTT 
FRPRSSSNASSVSTRIoSPIiRPESEVIiAESIPASVSSYAGGVPPT 
LNEGLELLDGLNLTSSHSLLSRSGLSGFSIiQHPGVTGPLHTYSS 
SLFS PAEGPLSAGEGCFSSSQALEAIiLTSDTPPPPADVLMTQVD 
PILSQAPIXLIiLGGLPSSSKLATGVGLCPKPLEAPGPSSLVPTL 
SMIAPPPVMAS7lPIPKALGTPVLTPPTEAASQDRMPQDLDI.DMy 
MENLECDMDNI ISDLMDEGEGnOPNPEPDP 


5995 


2 


2437 


RPPGPGPASGAWLCTRARGSAAFVPPLPRPPSRGARRRRRIjPGR 
GVAAIiRRGPGSAPGLPRGRAERSAAGSGRGPSREERGAAAAAM. 
AEMMEEIjHSL\DP\RRQEIiLEARF\TGLGVSKGPIilTSESSNQA 
CSVGSLSDKEVETPEKKQNDQRNRKRKAEPYETSQGKGTPRGHK 



421 




5996 






ISDYFERRVEQPLYGLDGSAAKEATEEQSAIjPTLMSVMIjAKPRL'" 

DTEQLAQRGAGLCFTPVSAQQNSPSSTGSGNTEHSCSSQKQKI 

QHRQT\QSDLTIEKISALENSKNSDLEKKEGRIDDLLRANCDLR 

RQI\DEQQKMLEKyK\ERLNRCFDNEPRNFLIEKSKQEKMACRD 

KSMQDRLRLGHFl-I'VRHGASFTEQWTDGYAFQNLIKQQERINSQ 

REEIERQRKMLAKRKPPAMGQAPPATWEQKQRKSKTNGAENETL 

TLAEYHEQEEIFKLRUSHLKKEEAEIQAELERLERVRNLHIREL 

KRIHNEDMSQFKDHPTLNDRYLLUiriLGRGGPSEVYKAPDLTEQ 

RYVAVKIHQLMKNWRDEKKENYHKHACREYRIHKBLDHPRIVjECI, 

yDYFSLDTDSFCTVLEYCEGNOLDFYrjCQHKLMSEKEARSlIMQ 

IVNALKYIiNEX KPP I IH YDLKPGNILLVNGTACGE I KITDFtSIiS 

KIMDDDSYNSVDGMELTSQGAGTYWYLPPECPWGKEPPI-aSNK 

VDVWSVGVIFYQCLYGRKPPGHNQSQQDILOENTILKATEVQPP 

PKPWTPEAKAFIRRCLAYRKBDRIDVQQLACDPYLLPHIRKSV 

STSSPAGAAIASTSGASNNSSSN 






1612 


981 


DQQACLLGliKLiTtiEFGILEFDPSWIGSWTQR/SWVSWRSliPGCE 
LFSIWFGSIVNEGYLNSASEGEEFCIYNRNPNACSYGVAVGVL 
APLTCLLYIjALDVYFPQISSVKDRKKNAVLSGHPWSGEPHPAA 
FWAFLWFTGDSCYLXANQWQVSKPKDNPLNEGTDASPGRPSPFS 
FFS I FTWSLTAALAVRRFKDLS FQEEYSTr»PP \ ASAQP 




5997 


1612 


981 


DQQACLLGIiMLTLEPGIIiEFDPSVriGSWTQR/SWVSWRSRPGCE 
LFSIWFGSIVNEGYLNSASEGEEPCrVNRNPNACSYOVAVGVI* 
AFLTCLLYLALDVYFPQISSVKDRKKVAVLSOHPWSGBPHPAA 
PWAFl,WFTGDSCVL\;VNQWQVSKPKDNPLiNEGTDASPGRPSPFS 
FFSIPTWSLTAAIiAVRRFKDtiSFQEEYSTLPP\ASAQP 




5998 


1612 


981 


DQQACLIiGLMLTLEFG I LE FDPSWIGSWTOH/'SWVSWRSRPGCfi 
LFSI WFGS I VNEGYLNSASEGEEFCI YNRNPNACSYGVAVGVL 
AFLTCIXYLALDVYFPQISSVKDRKK\AVLSGHPWSGEPHPAA 

fwaplwptgdscylVanqwqvskpkdnplnegtdaspgrpspfs 

PFS X FTWSLTAAtAVRRFKDLSPQEEYSTLPP \ ASAQP 




5999 


2 


1790 


rppmekarrggdgvprgpvlhivwgfhhkkgcqvefsypplip 
gdghdshtlpeewkyijpfl*alpdgahnyqedtvffhijpprngna 
atvfgiscyrXqieakalkvrqabitretvqksvcvlsklplyg 

LUJAKLOLlTHAYFEEKDFSQISIIiKELYEHMNSSLGGASLEGS 

qvylglsprdlvlhfrhkglilfklillekkvlpyispvnklvg 
almtvlslfpgmiehglsdcsqyrprksmsedgglqesnpcadd 
fvsastadvshtnlgtirkvmagnhgedaamkteepi^fovedss 

KGQEPNDTNQYLKPPSRPSPDSSESDMETXiDPSVLEDPKLKERE 

qlgsdqtnlfpkdsvpseslpitvqpqantgqvvlipgi,isgle 

EDQYGMPrAIFTKGYLCLPYMALOOHHLLSDVTVRGFVAGATNI 

LFRQQKHLSDAIVEVEEALIQIHDPELRKXiLNPTTADLRFADYIi 
VRHVTFlNt^ni'ivpr.ririTY^Mpr'r'ncMTTJTvri'OT^Trtr-rTTUT r n-Mm*- a- 

V J. Ljf<ir^uij V r ijUKj iXaVt r,\jil3UttVt 1 KA^ j A. VY T n ATi*! > AATliQijV 

L,FRIVNVAKKIGMVMVTr\SRNWQTGK\AVGQSVGGAFS\SAK 
TA\MSSWLSTFTTSTSQSLTEPPDEKP 




6000 


101 


xS61 


rEPCRTAENCTATMSENNKNSIjESSIiRQZiKCHFTWNLMEGENSL 
DDFEDKVFYRTEPQNREFKATMCNIjIiAYLKHLKGQNEAALECLR 
KAEELIQQEHADOASIRSLVTWGNYAWVYYHMGRLSDVQIYVDK 
VKHVCEKFSSP YR I ES PELDCEEGWTRLKCGGNQNERAICVCFEK 
ALEKKPKNPEFTSGIAIASYRLDNWPPSQNAIDPLRQAIRLNPD 
NQyLICVLLALKLHKMRBEGEBEGEGEK\l.VEEAi:.BKAPG\VTOV 
LRSAA\KFYRGKDEPDKAIELIiKKALEYrP\NNAYI»HCQlGCCY 
RAKVFQVMNLRBNGlCtrGKRKLLELIGHAVAHLKKADEANDNliFR 
VCS I rASIiHAIiAiDQ YEDAE Y YFQKEF5 KEIiTPVAKQLIjHLRYGN 
FQLYQMKCEDKAlHHFIEGVKINQKSREKEKMKDKLQKIAKMRL 

SKNGADSEALHVLAFLQEIiNEKMQQADEDSERGIiESGSLlPSAS 
SWNGE 




6001 


176 


1038 


AFAHSPSRGHRKTHIHTPRHTPRCTMAESHLQSSLr'TASQFPEI 
WLHFDADGSGYLEGKELQNXiXOELQQARKKAGLELSPEMKTFVD 
QYGQRDDGKIGIVELAHVLPTEENFLLLPRCQQI*KSCE\EFMKT . 
WRKYDTDHSGFIETEELraJFLKDLIiEKANKTVDDTKLAEYTDLi | 








LKLFDSNWDGKLELTEMARLLPVQBNFLLKPQGIKMCGKEFNKA 
FELyDQDGNGYlDENELDALIiKDLCEKWKQDbDINNITTYKK^I 
MALSDGGKLYRTDLAIU 1 LCAGDN 


6002 


977 


ei 


LAPPGGGLHIPPRTPLSHSRPPPSHHAPHPSPLPLPPADLHPHS 
SMAQRSDLIjEIiDCQI*TRDR\AAn;SHDENLCRQSGLNRDVGSLDP 
EDLPLYKEKI.EVYFSPGHPAHGSDRRMVRLEDI1FQRPPRTPMSV 
EIKGKNEELIREQ/VLVRRYDRNEITIWASEKSSVMKKCKAANP 
EMPLSFTISRGFWVLLSYYUSLLPFIPIPEKPFFCFLPNIINRT 
YFPFSCSCLUQLIiAWSKWIilMRKSLIRHLEERGVOWFWCLKE 
ES DFEAZVFSVGATGVITDYPTALRHYLDNHGPAARTS 


6003 


140 


4098 


GKIiRAFRQMRRLlCKRICDYKSFDDEESVDGNRPSSAASAFKVP 

APKTSGNPANSARKPGSAGGPKVGAGASKEGGAGAVDEDDFIKA 

KIWPSIQIYSSRELEETI^KIREILSDDKHDWDQRAKALKKIR 

SLIiVAGAAQYDCFFQHLRLLDGALKLSAKDLRSQWREACITVA 

KLSTVLGNKFDHGAEAIVPTLFNLVPNSAKVMATSGCAAIRFII 

RHTHVPRLIPLITSNC:?SKSVPVRRRSFEPLDI^QEWQTHSLE 

RHAAVIiVETIKKGIHDADAEARVEARKTYMGLRNHFPGEAETLY 

KSLEPSYQKSLQTYLKSSGSVASLPQSDRSSSSSQESLNRPPSS 

KWSTAITPSTVAGRVSAGSSKASSLPGSLQRSRSDIDVHAAAGAK 

AHHAAGQSVRSGRU5AGALNAGSYASUBDTSDKUDGTASEDGRV 

RAKIiSAPIiAGMGMAKADSRGRSRTKMVSQSQPGSRSGSPGRVIiT 

TTALSTVSSGVQRVLVNSASAQKRSKIPRSQGCSREASPSRLSV 

ARSSRIPRPSVSQGCSREASRESSRDTSPVRSFQPLASRHHSRS 

TGALYAPEVYGASGPGYGISQSSRIiSSSVSAMRVI^rrGSDVEEA 

VADALLLGDIRTKKKPARRRYESYGMHSDDDANSDASSACSERS 

YSSRWGSIPTYMRQT\EDV\AEVIiNRO\SSmrSEftKEGLIiGLQN 

UjKNQRTLSRVELKRLCE I FTRMFADPHGKRVFSMFLETliVDFI 

QVHKDDLQDWI>FVI,LTQLLKKMGADia.GSVO?VKVQKArj[>VTRES 

FPNDI^FNIlAIRPTVDQTQTPSLKViCVAILKyiETlAKQMDPGD i 

FINSSETRLAVSRVTTWTTEPKSSDVRKAAQSVLISLFELNTPE 

FTMLLGALPKTPQDGATKLLHNHLRNTGNGTQSSMGSPLTRPTP 

RSPANWSSPLTSPTNTSQJNTLSPSAFDYDTEKIVINSEDIYSSLRG 

VTEAIQNFSFRSQEDMNEPI,KRDSKKDDGDSMCGGPG\MSDPRA 

ggdatdssqtalXdnkasllhsmpthssprsrdynpynysdsis 

PFNKSAXiKEAMFDDDADQFPDDLSLDHSDLVAEIiIiKRttSNHMER 
VEERKIALYELMKLTQEESFSVWDEHFKTILLIJiETLGDKEPT 
IRALALKVLREILRHQPARFKNYAELTVMKTLEAHKDPHKEVVR 
SABEAAS V\LATS I \ S PEQCI KVLCP I IQTADYP INXAAI KMQT 
KVIERVSKETLNLLLPEIMPGLIQGYDiTSESSVRKACVPCIiVAV 

HAVIGDELKPHLSQLTGSKMKLLNLYIKRAQTGSGGADPTTDVS 
GQS 


6004 


140 


4098 


GKI.RAFRGMRRLXCKRICDYKSFDDEBSVDGNRPSSAASAFKVt> " 
APKTSGNPANSARKPGSAGGPKVGAGASKEGGAGAVDBDDFIKA 
FTDVPS I Q I YSSRELEETLMKIREILSDDKHDWDQRANALKKIR 
SLLVAGAAQYrXTFFQHLRIXDGAtiFCLSAKDLRSQVVREACITVA 
HLSTVLGMKPDHGAEAIVPTX.PNrLVPNSAKVMATSGCAAIRFII 

RHTHVPRLXPLITSWCTSKSVPVRRRSFEFLDtiLLQEWQTHSLE 
RHAAVLVETI KKG I HDADAEAR VRAR KTYMf3T .pmm iPDr-T? B.i?n»r v 

KSLEPSYQKSLQTYLKSSGSVASLPQSDRSSSSSQESLNRPFSS 

KWSTANPSTVAGRVSAGS5KASSLPGSLQRSRSDIDVNAAAGAK 

AHHAAGQSVRSGRI^AGAIJtlAGSYASLEDTSDKLDGTASEDGRV 

RAKLSAPLAGMGNAKADSRGRS RTKMVSQSQPGSRSGS PGRVLT 

TTALSTVSSGVQRVLVNSASAQKRSKIPRSQGCSREAS PSRLS V 

ARSSRIPRPSVSQGCSREASRESSRDTSPVRSFQPLASRHHSRS 

TGALYAPEVYGASGPGYGISQSSRLSSSVSAMRVIiNTGSDVERA 

VADALLLGDXRTKKKPARRRYESYGMHSDDDAKSDASSACSERS 

YSSRNGSIPTYMRQT\EI)V\AEVI,NRCASSIIWSERKEGLU3i:.QN 

LLKNQRTIjSRVELKRLCEIFTRMFADPHGKRVFSMPLETLVDPI 

□VHKDDLQDMI,FVIXTQLLKKMGADLLGSVQAKVQKAEJ)Vr^ 

FPNDI^FNIl^RFTVDQTQTPSLKVKVAILKYrErriiAKQMDPGl^ 

PINSSETRl^VSRVITWTTEPKSSDVRKAAQSVLISLPEliNTPE 










e-rr^I^ALPKTFQDGATKLIJlNHLimGWGTQSSMGsK:^^ 

RSPANWSSPLTSPTKTSQNTLSPSAPDYDTENMNSEDIYSSKIG 

VTEAIQNPSFRSQEDMMEPLKRDSKKDDGDSMCGGPGXMSDPRA 

GGDATDSSQTAL\DNKASLLHSMPTHSSPRSRDYNPYNYSDSIS 

PFNKSAIiKEAMPDDDADQFPDDLSLDHSDIiVAEUCKELSNHKER 

VEERiaAlirELMiaCTQEESFSVWDEHFKTILiitiliLETLGDKEPT 

IRAliALKVIiREILRHQPARFKNyAELTVMKTLEAHKDPHKEWR 

SAEEAASV\IATSI\SPEQCIICVLCPIlQTADypiNLAAIKMQT 

KVIERVSKETI*NLLLPEIMPGLIQGYDNSESSVRKACVPCLVAV 

HAVIGDELKPHLSQLTGSK^IKLLKI,yIKRAQTGSGGADPTTDVS 
GOS 




6005 


i33 


S9SS 


RSSGRRQEQliGQFPtiKERKGMASGLGSPSPCSAGSEEEDMDALiT 

NNSLPPPHPENEEDPEEDLSETETPKLKKKKKPKKPRDPKIPKS 

KRQKKERMLLCRQLGDSSGEGPEFVEEEEEVAUISDSEGSDVTP 

GKKKKKKLGPKKEKKSICSKRKEEEEBDDDDDDDSKEPKSSAQLL 

BDWGMEDIDHVFSEEDYRTLTNYKAFSQFVRPXiXAAKNPKIAVS 

KMMMVI/SAKWRSFSTNNPFKGSSGASVAAAAAAAVAVVESMVTA 

TEVAPPPPPVEVPIRKAKTKEGKGPNARRKPKGSPRVPDAKKPK 

PKKVAPLKTKLGGFGSKRKRSSSEDDDLDVESDFDDASINSYSV 

SDGSTSRSSRSRKKi:,RTrKKKKKGEEEVTAVDGYBTDHQDyCEV 

CQOCIGEIII.CDTCPRAYHMVCLDPDMEKAPEGKWSCPHCEKEGI 

QWEAKEDNSEGEBIIiEEVGGDLEEEDDHHMEFCRVCKDGGELLC 

CDTCPSSYHIHCLNPPLPEIPKGEWLCPRCTCPALKGKVQKILI 

WKWGQPPSPTPVPRPPDAPPNTPSPKPLBGRPERQPPVKWQGMS 

YWHCSWVSELQI*ELHC\QVMFRNYQRKNDMDEPPSGDPGGDEEK 

S\RKRKNKDPKFAEMEERFYRYGIKPEW\MMIHRII*IKSVDKKG 

HVHYLIKWRDLPYDQASWESEDVEIQDYDLFKQSYWNHREXiMRG 

EEGRPGKKLKKVKIURKIiERPPETPTVDPTVKYERQPEyLDATGG 

TIjHPyQMEGLmJLRFSWAQGTDTILADEMGI^KrVCyrAVFDYSr. 

YKEGHSKGPFLV$APLSTIIN\WEREPEMWAPDMYV\VTYVGDK 

DSRAI IREKEFS \ FEDNAIRGGKKASRMKKEAS VECFHVIJiTSYE 

LITIDMAILGSIDWACLIVDEAHRLKNNQSKFFRVLNGYSLQHK 

Lr.LTGTPLQNiyriiEEt,FHLLNFLTPERFHNLEGFI.EEFADIAKED 

QIKKLHDMLG\PHMLRRLKADVFKNMPSKTEI,IV\RVEI,SPM\Q 

KKYYK\YILHSKFLKALN\ARGGGNQVSLLNVVMDLKKCCNHPY 

LFPVAAMEAPKMPNG^ryDGSALlRASGKU*LLQKM^lKKI,KEGGH 

RVLIFSQMTKMLDIJaEDFLEHEGYKYERIDGGITGNMRQEAIDR 

FNAPGAQQFCFIiIjSTRAGGLGX1«ATADTVIIYDSDWNPHNDIQ 

AFSRAHRIGQNKKVMIYRFVTRASVEERITQVAKKKMMIiTHLW 

RPGLGSKTGSMSKQELDDILKFGTEELPKDEATDGGGDNKBGED 

SSVIHYDDKAIERLt,DRNQDETEDTEI,QGMNEyiiSSFKVAQYW 

KtKEMQE BEE VBRE 1 1 KQEESVDPDY WEKLLRHHYEQQQEDLAR 

NLGKGKRIRKQVNYNDGSQEDRDWQDDQSDNQSDYSVASEEGDE 

DFDERSEAPRRPSRKX3LRNDKDKPLPPLLARVGGNIEVLGFIIAR 

QRKAFLNAIMRYGMPPQDAFTTQWLVRDIiRGKSEKEFKAYVSLF 

MRHLCEPGADGAETFADGVPRE6LSRQHVLTRIGVMSLIRKKVQ 

EFEHVNGRWSMPELAEUEENKKMSQPGSPSPKTPTPSTPGDTQP 

NTPAPVP PABDG I KIEENS LKEEES lEGEKEVKSTAPETAIECT 

OAPAPASEDEKVWEPPEGEBKVEKAEVKERTEEPMETEPKGKG 

AADVEKVEEKSAIDIiTPIWEDKEEKKEEEEKKEVMLQNGETPK 

DUTOEKQKKNIKQRFMFNIAI^GGFTBI/HSLWQNEERAATVTKKT 

YEIWHRRHDYWLLAGriNHGYARWQDIQNDPRYAILMSPFKGEM 

NRGNFLEIia!irKPIARRPKI.LEQALVIEEQLRRAAYLNMSEDPSH 

PSMALNTRFAEVECIAESHQHLSKESMAGNKPANAVLHKVLKQI, 

EELLSDMKADVTRLPATIARIPPVAVRLQMSERNILSRLANRAP 

EPTPQQVAQQQ 




6006 


1 


965 



DNDFLRNTVHRHEPPVTAEPIRLLAEWEDVVVVDKPSStPVHPC""" 
GRFRHNTVI FILGKEHQLKELHPLHRl*DRr,TSGVI.MFAKXAAVS 
ERIHEQVRDRQLEKEYVCRVEGEFPTEEVTCKEPILWSYKVGV 
CRVDPRGKPCETVFQRLSYNGOSSWRCRPLTORTHQrRVHLQil' 
EiGHPILNDPrYNSVAWGPSRGRGGYIPKTNEEDLRDLVAEttQAK 








QSLDVlJ)LCEGDLSPGLTDSTAPSSELGKDDIiEELAAAA\QKME"~ 

EVAEAAPQEbDTIAIiASKKAVETDVMNQ\RQT\TLCRVPAGATG 
S LAPRPCDVPTCPTL 


6007 


3 


2351 


HEI^VEYVFTDKTGTLTENEMQFRECSINGMKYOEINGRLVPE" 

GPTPDSSEGNLSVLSSLSHIiNNLSHLTTSSSFRTSPENETELIK 

EHDIiFFKAVSLCHTVQIMNVQTDCTGDGPWQSNtiAPSQLEYYAS 

SPDEKALiVEAAJXPTf!TVtJ*Tr'MCPir*rMn*tfv*r*T ....i . , „ 

fi*r^K.i\jxLJv atu^^M^x^xv J* J.^»JNiJEETMBViU!XiGKtiERYKI»LHII*E 

FDSDRRRMSVIVQAPSGEKLLFAKGAESSILPKCIGGEIEKTRI 

HVDEFALKGLRTLCIAYRKFTSKEYEEIDKRrFEARTALQQR\E 

EKIiAAVFQPIEKDLILLGATAVKDRLQDKVRETIEALRMAGIICV 

WVLTGDKHETAVSVSLSCGHFHRTMNILELINQKSDSECAEQLR 

QLARRlTEDHVIQHGLWDGTSLSLALREKEKIiFMSVCRNCSAV 

LCCRMAPLQKAKVIRLIKISPEKPITIAVGDGANDVSMIQEAHV 

GIGIMGKEGRQAARNSDYAIARFKFLSKLIiFVHGHFYYIRIATL 

VQYPFYKNVCFITPQFIiYQFYCLFSQQTIiYDSVYLTLYXNICFT 

SliPILIYSLLBQHVDPHVLQNKPTIiYRDISKNRLLSIKTFLyWT 

ILGFSHAFIFFFGSYLLIGKDTSLLGMGQMFGNWTPGTLVPTVM 

VITVTVXMALETHPWTWINHLVTWGS I XFYFVPS LFYGGl LWPP 

I^SQN^aYPVFIQLI,SSGSAWFAIILMVVTC£.FLDrfKKVFDRHL 

HPTSTEKAQLTETNAGlKCIiDSMCCPPEGEAACASVGRMLERVI 

GRCSPTHISRSV?SASDPFYTNDRSII,TLSTMDSSTC 


6008 


4SS4 


1089 


AGVRRAGARRGPGRALPAGATAVpppSARRRRRCPAPEHAGPAR 

ASRPSQETMFQIiPVNMLGSLRKARKTVKKIJLSDIGLEYCKEHIE 

DFKQFEPNDFYLKNTTWEDVGLWDPSLTKNQDyRTKPFCCSACP 

FSSKFFSAYKSHFRNVHSEDPEtJRILLNCPYCTFNADKKTLETH 

IKIFHAPNASAPSSSLSTFKDKKKNDGLKPKQADSVEQAVYYCK 

KCTYRDPLYEIVRKHIYREHFQHVAAPYIAKAGEKSLNGAVPLG 

SNAREESS1HCKRCI,FMPKSYEALVQHVIEDHERIGYQVTAMIG 

HTNWVPRSKPLMLIAPKPQDKKSMGi:,PPRIGSlASGNV\RSbP ^ 

SQQfWNRLSIPKPNIiNSTGVNMMSSVHIiQQNMYGVKSVGCGYSV 

GQSMRLGLGGNAPVSlPCMSQSVKQLLPSGNGRSyGLGSEQRSQ 

APARYSLQSANASSLSSGQLKSPSLSQSQASRVLGOSSSKPAAA 

ATGPPPGNTSSTQKWKICTICNHLFPEKTVYSVHFEKEHKAEKVP 

AVANYlMXIHNFTSKCLyCNEYT.PTTVrT TMMMT Tur»T c«/-ir»v/-*r»o 

TFNDVEKMAAHMRMVHIDBBMGPKTDSTI*SFDLTLQQGSHTNIH 
U,VTTYNI»RDAPAESVAYHAQNNPPVPPKPQPKVQEKADIPVKS 
SPQAAVPYKKDVGKTLCPLCFSILKGPISDAXAHHLRERHQVIQ 
TVHPVEKJOiTYKCIHCIiGVYTSNMTASTlTLHLVHCRGVGKTQN 
GQDKlNAPSRLNQSPSIAPVKRTYEQMEFPLaKKRKLDDDSDSP 
SFFEEKPEEPWLALDPKGH\EDDSYEARKSFLTKYFT\KaPYP 
TRRE1EKIJ^SLWV\WK\SDIASHFSNKRKKCVRDCEKYKPGVL 
LGFN>^KELNKVKHEMPFDAEGLPENHDEKI>SRVNASK'rADKKI*N 
LGKEDDSSSDSPENLEE3SNESGSPFDPVFEVEPKISNDNPEEH 
VLKVIPEDASESEEKLDQKEDGSKYETIHLTEEPTKLMHNASDS 
EVDQDDWEWKDGASPSESGPGSQQVSDFEDNTCEMKPGTWSDE 
SSQSSDARSSKPAAKKKATMQGDREOLKWKNSSYGKVEGPWSKD 
QSQWXNASENDERLSNPQIEWQNSTrDSEDGEQFDNMTDGVAEP 
MHGSIAGVKLS SQQA 


6009 


4272 


1534 



CHGLQHLTPFREI*NLSLQG*EPH*AA*QAVRSEEKSIC*GSPSC ' 

HLVLGVI.VPVARQSSHSAGPAQSAFR*TGTGSGTPKAAEQSGYW 

EAYTLGHQHWNMFPIQRPPLVMKORRIMCGKCBKG*VSDSVXGG 

RAVAGEQASQRRTVFTAGGGEClKSAKSVRASVFTCNQPGVMGLr, 

NGKRGGCFESGYLFGFIVIGKIQSLEAKVPLPWGQTCERASPG 

MCRIHIVtlAVC*SEHH*DHFLAAAFLEtISTlIS*VAPGSWQDHA 

irtKJKEVQASVRCRGFESVDTAPAGPWAHSPPGLQGEPTTTSVSL 

FVIAPQDGEGVPFVEGQLVTVLGLVVPQSIRHTFVHHTQLFLHP 

r*KLGALDVAFLliLLTLVCSSPNVAYG*GKMGGTTLHQLFABVN 

^VTRGSAVQRRPSXTISSIHVDTKIQQEUJDVMVAGADGWQWG 

3PFWGlAGIPHIiIDDPLHQIELSFQRRV*EQCQGVKPDSQPVP 

iPLRVGLLQVGPLVRGGGRRVAGRGKRCWRDLLFPWRWGLSHRTf 

U3LLRGGDRGHVWIVIiaiLGSLVGGIiGXDELIiWFGGR*IiII IG 








I * * RGRLSGEWGCGLGRGELFQVS IGIGVS I VHIGQGDHEVfiGG- 

AGLVERGALHATGQGVEALVCXJLLDVGPAGALGICDGAALFQtSP 

GRVGOLPAEGLQVCITLVAQWRMHDGRELGGAEWPWQALHGAAI 

CGVGGAILLKAI>SQYFLKGG*RLWCARGQ*PVKKRQRRWRG*TR 

R*NGLTIHCFN* LI *GAVCCRLVILRWCGLI,EVHGVYOT* IHCL 

GSPPGRLWP+PPISQERPNGHCQVJBFRIAVPSWKCRWSRWRVRG 

TWRYGKPLLNLL*GAWLGGAACGGQQGGPLSTWQACTGPGQAAP 

LPPrOGACRPRTQRCRTWVGPIAWRQLLArrRD 


6010 


1 


3533 


IMPCGSSRLLRGCWTHPNEPVSDLSYFDCIESVMENSKVLGESM'- 

AG I SQNAKTGDLPAFGECVGI AS KALCGLTEAAAQAAYLVGrFD 

PNTSQAGHQGLVDPIQFARANQAIQMACQNLVDPGSSPSQVLSAA 

TIVAKHTSALCNACRIASSKTANPVAKRHFVQSAKKVANSTANL 

VKTI KALDGDFS EDNRNKCRl ATAPLI EAVENLTAPASUPEFVS 

IPAQISSEGSQAQEPIIiVSAKPMI.ESSSYI*IRTARSrAIKPKDP 

PTWSVLAGHSHTVSDSIKSLITSIRDKAPGQREGDYSIDGINRC 

IRDIEQASLAAVSQSLATRDDISVEAriQEQLTSWQEIGHLIDP 

lATAARGEAAQLGHKGTQLASYFEPLILAAVGVASKILDHQQQM 

TVLDQTKTIJ^ESALQMLyAAKEGGGNPBCAQHTHimTEAAQL^ 

EAVDDIMVXr.NFAASEVGLVGGMVOAIAEAMSKLDEGTPPEPKG 

TFVDYQTTWKYSKAIAVTAQEMMTKSVTNPBEIiGGIASQMTSD 

YGHLAFQGQMAAATAEPEEIGFQIRTRVQDLGHGCIFLVQKAGX 

ALQVCPTDSYTKRELXECARAVXEKVSIiVLSALQAGNKGTQACI 

TAATAVSGIIADLDTTIMFATAGTLNAENSETFADHRENILKTA 

KALVEDTKLLVSGAASTPDKIiAQAAQSSAATITQlAEVVKLQAA 

SLGSDDPETQWLINAIKDVAKALSDLXSATKGAASKPVDDPSM 

YQLKGAAKVffVTWTSLLKrVKAVEDEA'rRGTRALBATIBCIKQ 

ELTVFQSKDVPEKTSSPEESIRMTKGITMATAKAVAAGSNSCRQE 

DVIATANLSRKAVSDMLTACKQASFHPDVSDEVRTRAUIFGTEC 

TLGYI.DLLEHVLVIZ,QKPTPELKQQXAAFSKRVAGAVTEI.IQAA ' 

EAMKGTEWVDPBDPTVIABTET^LOAAASIEAAAKKLEQLKPRAK 

PKQADETLDPEEOlLEAAKSZAAATSAIiVKSASAAQRELVAQGK 

VGSIPANAADDGQWSQGLISAARIWAAATSSLCEAANASVQGHA 

SEEKLISSAKQVAASTAQI.LVACKVKADQDSEAMRRLQAAaWAV 

KRASDNLVRAAQKAAFGKADDDDVWKTKFVGGIAQI lAAQEEM 

LKKERELEEARKKLAQ IRQQQYKFLPTELREDEG 


6011 


446 


1835 


LLQPAMRKSPGX^DCLWAWILLLSTLTGRSYGQPSLQDELKPNT 
TVFTRILDRLLDGYDNRI^RPGLGERVTE VKTDIFVTS PGPVSDH 
DMBYTIDVFFRQSWKDERLKFKGPMTVIiRIiNNLMASKIWTPDTF 
FHNGKKSVAHNMTMPNKLLRITSDGXI»IiYTMRLTVR\AECPMAF 
GRDFPM\D\AHACPLKFGSYAYTRAEWYEWTREPARSV\ArAED 
GSRIiNQYDIiliGQTVDSGlVQSSTGEYWMTTHPHLKRKIGYFVI 
QTYI.PCIMTVILSQVSFWLNRESVPARTVFG\rrTVLTMTTLSIS 
ARTfSLPKVAYATAMDWFlAVCYAFVFSAl^IEEATVKYFTKRGYA 
WDGKSVVPEKPKKVKDPLIKKNNTYAPTATSYTPNLARGDPGLA 
TIAKSATIEPKEVKPETKPPEPKKTFNSVSKIDRLSRI7VFPLLF 
QIFNLVYWATYI,NREPOLKAPTPHQ 


6012 


351 


5013 



PAELFQS FAI WHKEIi YDWRLGP WNQCQPVI S KSLEKPLECI KGE 
EGrQVREIACrQKDKDrPAEDirCEYFEPKPLLEQACblPCQQD 
CIVSEFSAWSECSKTCGSGLQHRTRHWAPPQFX3GSGCPNLTEP 
QVCQSSPCEAEELRYSLHVGPWSTCSMPHSRQVRQARRRGKNKE 
REKDRSKGVKDPEARELlKKKRl^RNRQl«iQBNKYWDIQrGYQTR 
EVMCXNKTGKAADLSFCQQEIOiPMTPQSCVrTKECQVSEWSEWS 
PCSKTCHDPIVSPAGTRVRTRTIRQFPIGSEKECPEFEEKEPCLS 
QGDGWPCATYGWRTTEWTECRVDPLLSQQDKRRGNQTAIiCGGG 
IQTREVYCVQANENLIiSQLSTHKNFCEASKPMDLKLCTGPIPNTT 
QLCHIPCPTECEVSPWSAWGPCTYBN<3n)QQGKKGFKLRKRRIT 

neptggsgvtgncphi.l2aipceepacyd«kavrlgix:epdngk 

ECGPGTQVQEWCINSDGEBVDRQLCRDAIFPIPVACDAPCPKD 

cvlstwstwsscshtcsgkttegkqirarsilayageeggircp . 
^ssalqevrs cne3ipctvyhwqtgpwgqciedtsvssfirrttt# 

^GEASCSVGMQTRKVXCVRVNVGQVGPKKCPESURPETVRPCLL | 










PCKiCDCIVTPYSDWTSCPSXSCKEGDSSIRKQSRHRVIIQIjPAir" 

GGRDCTDPLYEEKACEAPQACQSYRW\KTHKW\HRCQ\LVP\WS 

VQQDSP\GAQEGCX5PGRQARAITCRKQDGGQAGIHECLQYAqpv 

PAUTQACQIPCODDCQLTSWSKFSSCNGDCGAVRTRKRTIiVGKS 

KKKEKCKNSHLYPLIETQYCPCDKYNAOPVGWWSDCILPEGKVE 

VLLGMKVQGDIKECGQGYRYQAMACYDQNGRIiVETSRCNSHGYI 

EEACI I PCPS DCKLSEWSNWSRCS KS CGSGVKVRSKWLRSKPYN 

GGRPCPKLDHVNQAQVYEWPCHS0CNQYLWVTEPWS ICKVTFV 

NMRENCGEGVQTRKVRCMQNTADGPSEHVEDYLCDPEEMPLGSR 

VCKLPCPEDCVISEWGPWTQCVLPCNQSS FRQRSADP IRQPADE 

GRSCPNAVEKEPCNLNKWCYHYDYNVTDWSTCQIiSEKAVCGNGI 

KTRMLDCWSIXSKSVDLKYCEAUSLEKimQMKTSCMVECPVNCQ 

LSDWSPWSECSQTCGLTGKMIRRRTVTQPFQGDGRPCPSLMDQS 

KPCPVKPCYRWQYGQWSPCQVQEAQCGEGTRTRNISCWSDGSA 

DOFSKWDEEFCADIELI ItX3NKNMVI»EESCSQPCPGDCYLKDW 

SSWSLCQLTCVNGEDLGFGGIQVRSRPVIIQELENQHLCPEQML 

ETKSCYDGQCYEYKWMASAWKGSSRTVWCORSDGINVTGGCriVM 

SQPDADRSCNPPCSQPHSYCSETKTCHCEEGYTEVMSSNSTLEQ 

CTLIPVWLPTMEDKRGDVKTSRAVHPTQPSSNPAGRGRTWFLQ 

PFGPDGRIiKTWVYGVAAGAPVLIilFIVSMIYLACKKPKKPQRRQ 

NNRLKPLTIAYDGDADM 




6013


11^1 


710 


GAFIAGVJfV^iPVIilRYPNSLDTTSWAWRGPGVI^KVLKLTASQPC 
SIVDVEPLPVYHPSPEESRDPTLYANNVQRVMAQAIiGIPATECE 
FVGSbPVIWGRLKVALEPQti/WGTGKSASEGWAVRKLCGRWGR 
ARPESNDQPC5RVC0AATA1* 




6014


2857 


613 


EAVAGGMEKSRMNLPKGPDTLCFDKDSFMKEDFDVDHFVSDCKK 

RVQIiEEIJWDDLELYYKLliKTAMVELINKDYADFXVNLSTMLVGM 

DKALNQLSVPLGQLREEVLSLRSSVSEGIRAVDERMSKQEDIRK 

KKMCVLRLIQVIRSVEKI EiaLNSQSSKETSALEASSPLLTGQI ^ 

LERIATEFKOIbQFHACQS K\GMPLLDKVRPRI AG ITAMLQQSI.E 

GIjLLEGEXJTSDVDI IRHCLRTTf ATIDKTRDAEALVGQVLVKPYI 

DEVI lEQFVESHPNGLQVMYNKIiIiEFVPHHCPOiLREVTGGAISS 

EKGNTVPGYDFTiVNSVWPQIVQGLEEKLPSLFNPGNPDAFHEKY 

TISMDFVRRLERQCGSQASVKRX/RAHPAYHSFNKKWNriPVYPQI 

RPREIAGSLEAALTDVI,EDAPAESPYCU^HRTWSSLRRCWSD 

EMFLPLI,VHRIiWRLHSGRFWARYSVFV\N\ELSLRPISNESPKE 

IKKPLVTGSKEPSITQGNTEDQGSGPSHTKPWSISRTQIjVYVV 

ADLDKLQEQLPELLEI I KPKLEMIGFKNFSS ISAALEDSQSS FS 

ACVPSLSSKlIQDI,SDSCFGFr»KSAIiEVPRLYRRTNKEVPTTAS 

SYVDSAIiKPIjFQLQSGHKDKLKQAIIQQWLEGTIiSESTHKYYET 

vsdvlnsvkkmeeslkriikqarkttpanpvgpsggmsdddkirii 
qlaldveylgeqiqkxglqasdiksfsaljvelvaaakdqataeq 

P 




6015 


13 


2237 


AEGCAERRGTEPWELSMSWESGAGPGIiGSQGMDI/VWSAWYGKC 
VKGKGSLPjUSAHG I WAWLSRAEWDQVTVYLPC0DHKLQRYALN 
RITVtJRSRSGNELPIAVASTADLIRCKIaliDVTGGLGTDEIiRLLy 

gmalvrfvnliserktkfakvplkclaqevnipdwivdlrheiit 
hkkmphindcrrgcyfvldwlqktywcroj:iEnslretwei:,eefr 
egieeedqeedkwrwdditeqkpkpqddgkstesdvkadgdsk 
gseevdshckkalshkelyerarellvsyeeeqptvlekfrylp 

KAIKAWNNPSPRVECVtAELKGVTCENREAVLDAPLDDGFLVPT 

feqlaalqieyeenvdi^ndvlvpkpfsqfwqpllrglhsqnftq 
allermlselpaiigisgirpryilrwtvelivantktgrnarrf 
sagqwearrgwrlfncsasldwprmvescligspcwaspqllrxi 
f\kamgqglqde\ eqekllrics lytqsgenslvqegseas pig 
kspytldstiywsvkpasssfgseakaqqqeeqgsvndvkeeeke 
ekevlpdqveeeeenddqeeeeededdeddeeedrmevgpfstg 
qesftaenarliiaqkrgalqgsawqvssedvrwdtfpxlgrmpr 
srprtpaelmlenydthvjfwrkpvlxeqrlepstckvtdtlgi. 

\SCGVGS\GNCSNSSSSNFRGAFLLEARGSLH\GL\KTGLQI,IV 




6016


13 


2237


ASGCAERRGXEPWELSMSWESGAGPGLGSQGMDLVWSAWYGKC








WO 01/53312



PCT/US00/34263











ViaSKGSLPLSAHGIWAWLSRAEWDOVTVYLTCDbHKi^RYJ^ 
RITVMRSRSGNELPLAVASTADLIRCKljLDVTGGJW5TDEI*Rt»I»y 
GMALVRFVNLISERKTKFAKVPLKCIiAQEVWIPDWIVDIiRHeLT 
HKKMPKINDCRRGCYFVLDWLQKTYWCRQLENSLRETWELEEFR 
EGIEEEDQEEDKNIWDDITEQKPBPQDDGKSTESDVKADGDSK 
GSEEVDSHCKKALSHKSLYERARELLVSyEEEQPTVLEKPRYLP 
KAI KAKNNPS PRVEC VLAELKGVTCENREAVI.DAFI»DDGFI*VPT 
FEOLAM,OIEyEEWVDLNDVLVPKPFSQFtfQ?LLRGLHSQNFTQ 
ALLSRMLSELPALGISGIRPTYILRWTVELIVANTKTGRNARRP 
SAGQWEARRGWRLFNCSASLDWPRMVESCLGSPCWASPQLLRII 
F\KAMGQGLQDE\BQEKLIiRICSIYTQSGENSLVOEGSEASPIG 
KSPYTLDSLYWSVKPASSSFGSEAKACQQEEQGSVNDVKEEEKE 
EKEVLPDQVEEEEBNDDQEEEEEDEDDEDDEEfc-DRMEVGPPSTG 
QESPTAENARLLAQKRGAIjQGSAWQVSSEDVRWOTFP\LGRMPR 
SRPRTPAELMLENYDTHVIFWTKPVIi\EQRLEPSTCK\TDTLGr. 
\SC!GVGS\GNCSNSSSSNPRGAFLr>EARGSLH\GL\K?rGLQIjF 


6017 


203 


3465 


shqeieqnsamaprkrggrgisfipccprnhdhpeityrlrnIds 

NFALQ-TMEPALPMPPVEELDVMFSELVDEIiDLTDKHREAMFAIiP 
AEKKWQIYCSKKKDQEENKGATSWPEFYIPQUISMAARKStJAL 
EKEEEEERSKTlESLKTALRTKPMRFVTRFIDLDGLSCILtlFLK 
TMDYETSESRIHTSLIGCIKALMWNSOaRAHVLAHSESIWlAQ 
SLSTENIKTKVAVLEII^AVCLVPGGHKKVLQAMLHYQKYASER 
TRFQTLINDLDKSTGRYRDEVSLKTAIMSP1KAVLSQGA6VBSL 
DFRLHLRYEXFLMLGIHPVI^KLRKHENSTLDRHLDFFEMLRNE 
DELEFAKRFELVHIDTKSATQKFELTRKRLTHSEAYPHFMSim 
HCL0MPYKRSGNTVQyWI.Li:iDRIIQQIVI<2NDKGQDPDSrPi:.BN 

DAKa^EKEEMMQTIiNKMKEKLEKETTEHKQ\^QQVAE:LTAQIiHE 

LSRRAVCASIPGGPSPGAPGGPFPSSVPGSItLPPPPPPPLPGGM i 

r*PPPPPPLPPGGPPPPPGPPPLGAIMPPPGAPMGLAI*KiaCS2PQ 

PTNALKSFNWSKLPENKLEGTVWTEIDDTKVFICIIiDLEDLERTP 

SAYQRQQDFFVNSNSKQKEADAIDDTLSSK1,KVKELSVIDGRRA 

QNCWIIiLSRLKLSKnaEIKRAILTMDEQSDLPKDMI*EQt)LKFVPE 

KSDIDLLEEHKHELDRMAKADRFIiFEMSRINHYQQRKJSLYFKK 

KFAERVAEVKPKVEAIRSGSEEVFRSGALKOliLEVVLAFGWrYMN 

KGQRGNAYGFKISSIiNKIADTKSSIDKNITIiLHYIjITIVEWKYP 

SVLNLNEEIiRD I PQAAKVNMTELDKE ISTIiRSGLKAVETELEYQ 

KSQPPQPGDKFVSWSQFITVASFSFSDVEDHAEAKDLFTKAV 

KHFGEBAGKIQPDEFFGIFDQPLQAVSEAKQENEMMRKKKEEBE 

RRARMEAQLKEQRE3iERKWRKAKENSEESGEFDDLVSALRSGEV 

PDKDLSKLKRNRKRITNQMTDSSRERPITKLKP 


6018 
6019 


13 
2 


2510
1066 


TISQSGGIRRRREAVWFEWNMDFSRLHMYSPPQCVPBNTGYTY 

AliSSSYSSDALDFETEHKLDPVFDSPRMSRRSLRliATTACTIiGD 

GEAVGADSGTSSAVSLKWRAARTTKQRRSTNKSAFSINHVSRQV 

TSSGVSYGGTVSLQDAVTRRPPVLDESWIRECyrrVDHPWGLDDD 

GDLKGGNKAAIQGNGDVGAGAATGHNGFPCSNCNMLS3RKDVLT 

AHPAAPGPVSRVYSRDRWQKCDDCKGKRHLDAHPGRAGTLWHIW 

ACAGYFLLQILRRIGAVGQAVSRTAWSALWLAWAPGKAASGVF 

WWI^IGWYQFVTLISWIiNVPLLTRCLRNICKFl.VI*Lr?LFLLLG 

IiSLRGQG\KFFS PLPVLNWASMHRTQRVDDPQDVFKPTTSRLKQ 

PLQGDSEAFPWHWMSGVEQQVASLSGQCHHHGENLREIiTTLLQK 

LQARVrXiMEGGAAGPSASVRDAVGQPPRETDFMAPHQEHEVRMS 

HhEDlLGKLREKSEAIQKELEQTKQKTlSAVGEQLhPTVEHLQL 

ELDQLKSELSSWRHVKTGCETVDAVQERVDVQVREMVKIJLFSED 

QQGGSLEQLLQRFSSQPVSKGDLQTMLRDLQI>QILRNVTHHVSV 

TKQIiPTSEAWSAVSEAGASGITEAQARAIWSAIiKI»YSQDKTG 

MVDFAIiESGGGSILSTRCSETYETKTALMSLFGIPLWYFSQSPR 

WIQPDIYPGMCWAFKGSQGYLWRLSMMIHPAAFTLEHIPKTIi 

SPTGNISSAPKDPAVYGLENEYQEEGQr^IiGQFTYDQDGESLQMF 

QALKRPDDTAPQIVELRIFSNWGHPEYTCLYRPRVHGEPVK 

TPNDREPPPQRPPSSRRASHIAQEITSAASLGDQTQIi,GSi;TTA 


6020 






PVITSArRaMPGISSQILTNAQGKiViGTLPWVVNSASX^APAPA- 

QSLQVQAVTPQLr,LMAQGQVIATLASSPI>PPPVAVRK\PSTPES 

LLKSBVQPrKPTPTVPQPAWIASPAPAAKPSASAPrPITCSET 

PTVSQt,VSKPHTPSLDEDGINX,EEXREFAKNFKIRRLSLGLTQT 

QVGOALTATEGPAYSQSAICRFEFCLDITPKSAQKLKPVLEKWIJJ 

EAELRNQEGQQNLMBFVGGEPSKKRKRRTSPTPQAIEALNAYFE 

KNPLPTGQBITEIAKELWYDREVVRVWFCNRRQTLKNTSKIJI^ 


6021 


4953


549 


EAIQFEVS iGNYGNKFDrX'CKPXASTTQYSRAVFixSNYYrniPW" 

AHTKPVVTLTSYWEDISHRLDAVNTLLAMAERU2TNIEALKSG1 

QGKXPANQLAEIiWriKLIDEVIEDTRYTLPLTKGKANVTVLDTQI 

RKLRSRSLSQIHEAAVRMRSEATDVKSTLAEIEDWLDKLMQLTE 

EPQNSKPDIIIWMXRGEKRLAYARrPAHQVLYSTSGENASGKYC 

GKTQTIFLKypQEKKNGPKVPVELRVNIWLGt.SAVEKlCFNSFAE 

GTFTVFAEMYENQALMFGKWGTSGLVGRHKPSDVTGKIKLKREP 

FLPPKGWEWEGEWlVDPERSLIiTEADAGHTEFTDEVyQNBSRYP 

GGDWKPAEDTYTDANGDKAASPSELTCPPGWEWEDDAWSYDINR 

AVDEKGWEYGITIPPDHKPKSWVAAEKMYHTHRRRRLVRKRKKD 

LTQTASSTAGAMEELQPQEGWEYASLIGWKFHWKQRSSDTPRRR 

RWRRKMAPSETHGaAAIFKLEGALGADTTEDGDEKSLEKQKHSA 

TTVFGANTPIVSCK-FDRDYIYHLRCYVYQARNLtAnDKDSFSDP 

YAHICFI^RSKTTEIIHSTLNPTWDQTIIFDEVEIYGEPQTVLQ 

NPPKVIMELFDNDQVGKDEFLGRSIFSPWKLNSEMDITPKLLW 

HPVMNGDKACGDVLVTAELILRGKDQSNbPILPPQRAPMLYMVP 

QGIRPWQLTAIEILAWGLRNMKKFQMASITSPSLWECGGBRV 

ESWIKNLiaCTPNFPSSVLFMKVFI.PlCEEr.YMPPLVIICVIDMRQ 

FaRKPWGOCTIERr,DRFRCX>PYAGKEDIVPQLKASLLSAPPCR 

CIVIEMEDTKPl^IASKCLSSMSTALSKMASPATVHLTEKBBEIV 

DWWSKFYASSGEHEKCGQYIQKGYSKLKIYNCELEMVAEPEGLT ^ 

DFSDTFKLYRGKSDENEDPSWGEFKGSFRIYPLPODPSVPAPP 

RQFREI^PDSVPQECTVRIYIVRGLEIiQPQDNNGLCDPYIKITtiG 

KICVIE\DRDHyiPNTLNPVFGRMYEI*SCYIiPOEKDI.KXSVYDYD 

tftrdek:vgetiidi.enpf\lsrpg\shcg\ipeeyc::vsgvntw 
rdslrNptqNllqnvarfkgfpqpilsedgsriryggrdyslde 

FEANKILHQHI/5APEERIALHILRTQGr.VPEHVETRrtHSTPQP 
N1S\RYYLRVI ItsrNTKDVILDBKSITGEEMSDI YVKGWI PGNEE 

nkqktdvhyrsldgegkfiwrpvfpfdylpaeqlcivakkehfw 

SIDQTEFRIPPR\LIIQ1W\DNDKFS\LDDYLGFPRTLTCRHTI 
HFI^KSPGGNC/RGLDMIPDLKAMNPLKAKTASLFBQKSMKQWW 

pcyaekdgarwjagkvewtleilnekeadbrpagkgrdepnmwp 

KUDLPNRPETSFIiWFTNPCKTMKFIWfRRFKWVIIGl.t.FLLlIJ:. 
LFVAVLIiYSLPKYLSMKIVKPNV 




4953 


549 



EAIQFEVSIeNYGNKFDTrCKPlASTTQYSRAVFl)GNYYYYi:PW~' 

AZ-rrKPVVTLTSYWEDISHRLDAVNTLIAMAERLQTWXEALKSGI 

QGKlPAWQIjySLWLKLrDBVIEini?YTLPLTEGKAEmvriDTQI 

RtCLRSRSLSQIHEAAVRKRSEATDVKSTLAEIEDWLDKU^LTE 

EPQNSMPDXIIWMIRGEKRXAYARIPAHQVLYSTSGENASGKYC 

GKTQTIFLKYPQEKNNQPKVPVELRVNIWIX3LSAVEKKFNSFAE 

GTFTVFAEMYEKQALMFGKWGTSGLVGRHKFSDVTGKIKLKREF 

FUPPKGWEHEGEWIVDPERSLLTEADAGHTEFTDEVYQNESRYP 

GGDWKPAEDTYTDAKGDKAASPSELTCPPGWE»EDDAWSYDINR 

ATOSKGWEYGITXPPDHKPKSWVAAEKMYHTHRRRRLVRKRKKD 

LTQTASSTAGAKEELQDQEGWEYASLXGWKFHWKQRSSDTPRRR 

RWRRKMAPSETHGAAAIPKLEGALGADTTEDGDEKSLEKQKHSA 

mrFGANTPIVSa^FDRDyiYHLRCYVYQARm.tA^DKr>SFSDP 

^AHICFLHRSKTTEXIHSTLNPrtTOTXIFDEVEIYGEPQTVLQ 

^PPKVIMELFDNDQVGKDEFI/SRSIFSPWKLMSEMDITPKLLW 

iPVMNGDKACGDVLVTAEIilLRGKDGSWLPXLPPQRAPmiYMVP 

2girpwqltaiexlawgr.rnmknpqmasitspslvvecggerv 
sswiknlkktpnfpssvlfmkvflpkeelympplvikvidhr^ 
;;grkpwgoctierijdrfrcdpyagkedivpolkasllsappcr 








I)IVIEMEDTKPDLASKCljSSMSTAr,SKMASPATVHLTEKEEElv~ 

DWWSKFYASSGEHEKCXJQyiQKGySKLKlYNCELENVAEPEdfLT 

DFSDTFKLYRGKSDENEDPSWGEFKGSFRIYPLPDDPSVPAPP 

RQPRELPDSVPQECTVRIYIVRGLBLQPQDNNGLCDPYIKITLG 

KKyiE\DRDHYIPNTLNPVFGRMYELSCyr«PQEKDIjKISVYDYD 

TFTRDEKVGBTIII}IiBKPF\l,SRFG\SHCG\lI»EEYCVSGVNTW 

RDSLR\PTQ\LLQNVARFKGFPQPILSEDGSRIRYGGRDYSIiDE 

FEANKIIiHQHt/SAPEERLALHILRTQGLVPEHVETRTLHSTFXJP 

K1S\RYYLRVIIWKTKDVILDEKSITGEBMSDIYVKGWIPGNEE 

NKQKTDVHYRSLDGEGNFNWRFVFPPDYLPAEQLCIVAKKEHPW 

SIDQTEFRIPPR\LIIQIW\DNDKFS\LDDYLGPPRTt.TCRHTI 

HPLQICSPGGNC/RGLDMIPDLKAMNPI.KAKTASI,FEQKSMKGWW 

PCYAEKDGARVMAGKVEMTLEILNEKBADERPAGKGRDEPNMNP 

KLDLPNRPETSFLWPTNPCKTMKFIVWRRFKWVIIGtLFLIiILI. 

LFVAVLLYSLPNYLSMKIVKPKV 


6022 


4953 


549 


EAIQFEVSaGNTGNKFDTTCKPLASTTQYSRAVtl>6liyYYYl.PW 
AHTKPVVTLTSYWEDISHRLDAVNTLLAMAERLQTWrEALKSGI 
QGKIPANOlAEl^WLKLlDEVIEDTRyTIiPLTEGKANVTVIJDTQI 
RKLRSRSLSQIHEAAVRMRSEATDVKSTLAEIEDWLDKLMQLTE 
EPQNSMPDI I IWMIRGEKRIAYARIPAHQVLYSTSGBNASGKYC 
GKTQTIFLKYPQEKNNGPKVPVELRVNIWLGLSAVEKKFNSFAE 
GTFTVPAEMYENQALMFGKWGTSGLVGRHKPSDVTGKIKLKREF 
PLPPKGWEWEGEWIVDPERSLLTEADAGHTEFTDBVYQNESRYP 
GGDWKPAEDTYTDANGDKAASPSELTCPPGWEWEDDAWSYDINR 
AVDEKGWEYGITIPPDHKPKSWVAAEKMYHTHRRRRLVRKRKKD 
LTQTASSTAGAMEELQDQEGWEYASLIGWKFHWKQRSSDTFIiRR 
RWRRKMTVPSETHGAAAIFKLEGALGADTTEDGDEKSLEKQKHSA 
TTVPGANTPIVSCNPDRDyiYHLRCYVYQARNLXAriDKDSFSDP 
YAHI CFLHRSKTTEI IHSTLKPTWDQTI IFDEVEI YGEPQTVLQ 
NPPKVIMEIiFDNDQVGKDEFLGRSI FS PWKLNSEMDITPICI,LW 
HPVMNGDKACGPVLVTAELILRGKDGSWLPILPPQRAPNLYM^/P 
QGIRPWQLTAIEILAWGZ.RWMKNFQMAfllTSPSLVVEOGGERV 
ESWIKWJjKKTPNFPSSVLFKKVPLPKEELYMPPLVIKVIDHRQ 
KGRKPWGQCTIERLDRFRCDPyAGKEDIVPQLKASLt.SAPPCR 
DIVIEMEDTKPLIASKCLSSMSTALSKMASPATVHLTSKEBEIV 
DWWSKPYASSGEHEKCGQYIQKGYSKLKIYNCEIiENVAEFEQLT 
DFSDTFKLYRGKSD2NEDPSWGEFKGSPRiypLPDD?SVPAPP 
RQFREIiPDSVPQECTVRIYIVRGtiEriQPQDNNGIiCDPYIKITIiG 
KKVlEXDRDHYIPjrriiNPVFGRMYEI/SCYLPQEKDbKISVYDYD 

tftrdekvgetiid::iEnpp\lshfg\shcg\ipeeycvsgvntw 
RDSLR \ ptqXllqnvarfkgfpqpilsedgsriryggrdyslde 
feawkii1hqhi/3apeerlalhilrtqglvpehvetrti.hstpqp 
nrs\ryylrviiwntkdvii*deksitgee^3sdiyvkgwlpgnee 
nkqktdvhyrsldgegnptmfvpppdyiipaeqijcivakkehfw 
sidqtefripprXliiqiwXdndkfsXlddylgfprtltcrhti 

HFLQKSPGGNC/RGLDMIPDLKAMNPIjKAKTASLFEQKSMKGWW 
PCYAEKDGAR\/l<AGK\JTMTriE I Lbffi KieADFn pa^Tf R n 

KLDXjPNRPETSFLWFTNPCKTMKFIVWRRFKWVIIGLLFLLILL 
LFVAVLIiYSLPNYLSMKIVKPNV 




6023 


102 


316 


sqei/3Mfvei*nnllnttpdraeqgkltllcdaktix5Sflvhhfl ■ 

sfylkanckvcfvaliqsfshysivgqklgvsltmarergqlvf 

legii/ivcsgr\vfqaqkephplqplreanagniikpripefvrea 

IiKPVDSGEARWTYPVLIiVDDIiSVXiLSLGMGAVAVIiDFIHYCRAT 
VCWELKGNMWLVHDSGDAEDEEXroXDIiNGLSHQSHLILRAEGL 
ATGFCRDVHGQLRI LWRRPSQPAVHRDQS FTYQYKIQDKS VSFF 
AKGMSPAVL 




6024 


3 


3260 


FI*SFLCYPRFRCLFCLQFAIPASRMEQIiNEI.ELIJMEKSFWEEAB 
LPAELFQKKVVASFPRTVLSTG^IDNRYrJVIIAVNTVQNKEGNCEK 
RLVITASQSIiBNKELCIIiRNDWCSVPVEPGDI IHLEGDCTSDTtf 
IIDICDFGYLIi:*YPDMLISGTSIASSIRCMRRAVI,SETFRSSDPA 
TRQMLIGTVIiHEVFQKAINrfSFAPEKLQELAFQTiaEIRHLKEM 










YRLNI^QDKiKQEVEDYLPbFCKWAGDF^mKi^^TSTPFPQMQL6I^ 

PSDNSKDNSTCNIEWXPMDIEESlwrSPRFGLKGKIDVTVGVkl 

HRGyKTKYKIMPLELKTGKESNSIEHRSQWLyTLLSOERRADP 

EAGLLLYLKTGQMYPVPANHLDKRELLKLRHQMAFSLPHRISKS 

ATRQKTQIASLPQI lEEEKTCKYCSQIGNCALYSRAVEQQMDCS 

£» V f J. vru»i^Ki£,fc.£rrQHI*KQTHLEyFSLWCIiMLrLESQSKDNKKN 

HQMrWLMPASEMEKSGSCIGNLIRMEHVKtVCDGQYI^HNFXKrKH 

OArPVaWL^IAGDRVIVSGEERSLFALSRGYVKEIN^^"^VTCLLD 

RNliSVLPESTLFRLDQEEKNCDIDTPIjGNLSKLMENTFVSKKLR 

DLIIDFREPOFrSYI.SSVLPHDAKDTVACILKGLNKPQRQAMKK 

VLLSKDYTLIVGMPGTGKTTTICTLVRILYACGPSVI>LTSyTHS 

AVDNXLLKIAKFKIGFLRSR\QIQKVHPAIQQFTEHEICRSKSI 

KS\LALLEELYTSQI,IDATTCMGIMHPIFSRKIFDFCIVDEASQ 

ISQPrCIiGPLFFSRRFVLVGDHQQLPPl,Vl^EARALGMSESLP 

KaLEQNKSAWQt,TVCiYRMNSKIMSLSNia.TYEGKLECGSDKVA 

NAVINLRHFKDVKLELEFYADYSDNPWLMGVFEPKNPVCFLNTD 

KVPAPEQVEKGGVSNVTEAKIiIVFl*TSIFVKAGCSPSDIGIIAP 

YRQQIiKIINDIJLARSIGMVEVNTVDKyouXRDKSIVLVSFVRSW 

KIXSTVGBIiLKDWRRTJNfVArTRAKHKLIIJ^CVPSLNCypPLEKL 

tiNKIiNSEKLI IDLPSREHESLCHILGDFQRE 




6025 


3977 


89 


GGFPAQSDHbPPVFPLRSDr,LITMSTI,yVSPHPDAFPSLRALIA 
ARYGEAGEGPGWGGAHPRICLQPPPTSRTSFPPPRLPALBQGPG 
GLWVWGATAVAQLIiWPAGLGGPGGSRAAVLVQQWVSYADTELIP 
AACGATLPALGIiRSSAQDPQAVLGALGRALSPLEEWtiRLHTYlA 
GEAPTlADIJU^VTAIjIiLPFRyVLDPPARRIWWNVTRWFVTCVRQ 
PEFRAVI/SEWXySSARPLSHQPGPEAPALPKTAAQLKKBAKfCR 
EKLEKFQQKQKlQQQQPPPGEICKPKPE:<REKRDPGVITyDI.PTP 
PGEKKDVSGPMPDSySPRyVEAAWYPWWEQQGFFKPEYGRPNVS 
AANPRGVFMMCIPPPNVTGSLHLGHALTMAIQDSLTRWHRMRGE 
Trt,WNPGCDHAGlATQVWEK:a.WREQGIiSRHQLGREAFr*QEVW 
KMKBEKGDRIYHQLKKI/3oi>ijDWDRACFTMDPKLSAAVTEAFVR 
LHEEGIIYRSTRLVNWSCTLNSAISDIEVDKKELTGRTLIiSV^ 
YKEKVEFGVLVSPAYKVQGSDSDEEVWATTRIETMLGDVAVAV 
HPKDTRYQHLKGKNVIHPFLSRSLPIVPDBFVDMDFGrGAVKrr 
PAHDONDYEVGQRHGLEAISIMOSRGALINVPPPFLGLPRPEAR 

OGEMAQAASAAVTRGDLRII,PERHQRTWHAWMDNIRE\WCMFPG 
iCLWWGXHRMPAYFVrVSDPAVPPGEDPDGRYWVSGRNBABARE 
KAAKEFGVSPDK1SLQQDEDVLDTWFSSGLFPLSI1X3MPN0SED 
LSVFYPGTLI^TGHDir^FWVARMVMLGLKL'IGRLPFREVyi.HA 

ivrdahgrkmskslgnvidpldviygislqglhnqllnsnij:>ps 

EVEKAKEGQKADFPAGIPECGTDALRFGLCAYMSQGRDINLDVM 
RrtiGYRHFCNKLWNATKFALRGLGKGFVPSPTSQPGGHESLVDR 
WIRSRLTEAVRi:,SNOGPQAyDFPAVTTAQySFWi:.YELCDVyi,EC 
LKPVLNGVDQVAAECARQTLYTCLDVGLRLLSPFMPFVTEELFQ 
RLPRRMPQAPPSLCVTPyPEPSECSWKDPEABAALELALSITRA 
VRP\LRADyNLHPESGPTCFLEVAD\EATGAIASAVSGYVQGPG 
QAQWVAVAEPWGI,PAP\QGCAVAtASDRCSI\HIiQLQG\LI.DP 
AREl,G\KLQ\AKRVEAQ\RQAQ\RIJi\ERRA\ASGNPVKVPI,\E 
VQEADEAKIiQQTEAEIjRKVDEAIAI.PQKML 


6026 


2674 


514



CtPITFLKKKAKMKDMPLRIHVLLGIAITTLVQAVDKKVDCPRLC 

TCBIRPWFTPRSiyMEASTVDCNDLaLLTFPARLPANTQILLI.Q 

rWWIAKIEYSTOFPVWLTGrJDLSQNNLSSVTNXMGKKMPQLtiSV 

SfLEENKLTELPEKCLSEJUSNIiQELyrNHNLLSTISPGAFIGLHN 

C.LRLHLNSNRLQMINS KWFDALPNLEILMIGEMPI IRIKDMNPK 

PLINLRSLVIAGINLTErPDNALVGLENLESISPyDNRI,IKVPH 

^QKVVNLKFl^I^KNPINRIRRGDFSNMLHLKEIiGlNNMPEL 

CS IDSIiAVDNLPDLUKIEATNNPRLSYIHPNAFFRLPKLESU^L 

JSWALSALYHGTIESLPNLKEISIHSNP IRCt)CVIRWMNMNKTN 

CRFMEPDSLFCVDPPEFQG^^^QVHFRDMMEICLPLIAPESF?^ 

;NLNVEAGSYVSFHCRATA\EPQPEIYWITPSGQKrj:»PNT\l,TD 








KFYVHSEGTI^DIMGVTPKECjGLYTCIATNLVGAULK^VMlKVDG " 
SFPQDNNGSLNIKIRDIQANSVLVSWKASSKILKSSVKWTA^VK 
TENSHAAQSARIPSDVKVYNLTHLNPSTEYKICIDIPTIYQKNR 
KKCVNVTTKGUIPDQKEYEKNNTTTLMACU3GLLGI IGVICLI S 

CLSPEMNCDGGHSYVRNYLQKPTFALGELypPLINLWEAGKEKS 
TSLKVKATVIGLPTNMS 


6027 


5254


4148 


UUKKAPGRPGRSllCDKEKETVFREVVSFSPDPLPVRYYDKDTTK 

P ISFYLSSLEELLAWKPRLEDGFNVAIiEPLACRQPPLSSQRPRT 

LLCHEMMGGYLDDRFIQGSWQTPYAFYHWQCIDVFVYFSHHTV 

TIPPVGWTNTAHRHGVCVLGTFITEWNEGGRLCEAPLAGDERSY 

QAVADRIiVQIT\RFFRFDGWLIMIENSLSLAAVGNMPPFLRYl.T 

TQLHRQVPGGLVLWYDSWQSGQI.KKQDELMQHNRVPPDSCr)GP 

FTNYNKREEHLERMLGQAGERRADVYVGVDVPARGNWGGRFDT 

JDKVGGGFRPRASGPVPPLGPHFLMDLPFPSAPQRNDSSCSSOSG 
DPVALRNKCPAPAKI.CPH 


6028


120 


3432 


NCLLLQAKGFHGEIEDIiQQWLTPTERHLIJ^KPLGGLPEt^^ 

XiNVHMEVCT^EAKEETYKSLMQKGQQMLARCPKSAET^ 

NNLKEKWESVETKX>NER\KT\KLEEALNI*A\MEFHNSL\QDFIN 

WLTQAEQTLWVASRPSLILDTVI.PQIDEHKVPANEVNSHREQII 

EliDKTGTHLKYPSQKQDVVLIKNI^LISVQSRWEKVVQRLVERGR 

SU)DARKRAKQFHEAWSKLMEWLEESEKS1I)SELEIANI>PDKIK 

TQIAQHKEFQK:sriGAKHSVYDTTKRTGRSLKEKTSIJU>DHl,KLD 

DMLS3Lf!DKWDTICGKSVERQNKLEEA\H.FSGQFTI)ALQALID 

WLYRVEPQIAEDQPVHGDIDLVhan^IDNHKAFQKELGKRTSSVQ 

ALKRSAREIilEGSRDDSSWVKVQMQELSTRWETVGALSISKQTR 

ItEAAuRQAEEFHSWHALLEWLAEAEQTLRFHGVIiPDDEDALRT 

LIDQHKEFMKKIiEEKRAELOTCATTMGDTVLAI CHPDS ITTI KHW 

ITIIRARFEEVIAWAKQHCK3RIASALAGL1AKQEIJ:.KALIA5ILQ 

KAETTLXDKDKEVIPQEIEEVKALIAEHQTPMEEMTRKOPDVDK ' ^ 

VTKTYKRRAADPSSLQSHrPVLDKGRAGRKRFPASSLYPSGSQT 

QIETKNPRVNLI.VSKWQQVWIJLALERRRKLNDALDRI*EEIiREFA 

NFDFDIWRKKYMRWMNHKKSRVMDFFRRJDKDQDGKITRQEFID 

GI LSSKFPTSRLEMSAVADI FDRDGDG YID YYEFVAALHPNKDA 

YKPITDADKIBDEVTROVAKCKCAKRPQVEQIGDNICYRFPLGDIQ 

FGDSQQLRLVRILRSTVMVRVGGGWMALDEFLVKNDPCRAKGRT 

NKBLREKFIIADGASQGMAAFRPRGRRSRPSSRGASPNRSTSVS 

SQAAOAASPQVPATTTPKILHPLTRNYGKPWLTNSKMSTPCKAA 

ECSDFPVPSAEGTPIQGSKLRLPGYLSGKQFHSGEDSGIiITTAA 

ARVRTQFADSKKTPSRPGSRAGSKAGSRASSRRGSDASDFDISE 

IQSVCSDVETVPQTHRPTPRAGSRPSTAKPSKIPTPQRKSPASK 
LDKSSKR 


6029 


1 


3533 



IMPCGSSRI^LRGCWTHPNEPVSOLSYFDCIESVMENSFCVI^ESM 
AGISQNAKTGDIiPAFGECVGIASKALCGLTBAAAQAAYLVGIFD 
PNSQAGHQGLTOPIQFARANQAIQMACQNLVDPGSSPSQVI^AA 
TIVAKHTSALCNACRXASSKTANPVAKRHFVQSAKEVANSTANIi 
VKTIKALDGDPSEDNRNKCRIATAPLIEAVENLTAFASNPEPVS 
IPAQISSEGSQAQEPILVSAKPMLESSSYlilRTARSLAINPKDP 
PIWSVLAGHSHTVS DS T K T. TT<I TT?n ira I>rv^nr»/^vc^ T rirt -r 

IROIEQASLAAVSQSLATRDDISVEALQEQLTSWQEIGHLIDP 

lATAARGEAAQLGHKGTQLASYFEPLILAAVGVASKILDHQQQM 

TVLDQTKTLAESALQMLYAAKEGGGNPKAQHTHDAlTEAAQItMK 

EAVDDIMVTLNEAASEVGLVGGMVDAIAEAMSKLDEOTPPEPKG 

TFVDYQTTWKYSKAIAVTAQEMOTKSVTNPEEIXSGliASQI^SD 

YGHLAFQGQMAAATAEPEEIGFQIRTRVQDIiGHGCI FLVQKAG\ 

AUQVCPTDSYTKRELI ECARAVTEKVSLVLSALQAGNKGrQACI 

rAATAVSGIIADLDTTIMFATAGTLNAENSETPADHREKILKTA 

ECAI.VE]yrKI*LVSGAASTPDKLAQAAQSSAATITQlAEVVKLGAA 

SI^SDDPETQWLIKAIKDVAKALSDLISATKGAASKPVDDPSM 

fQLKGTUVKVMVTNVTSLIiKTVKAVEDEATRGT^ 

3LTVFQSKDVPEKTSS PEES IRMTKGITMATAKAVAAGNSCRQi 

^VIATANLSRKAVSDi^TACKQASFHPDVSDEVRTRAIiRFGTEC 



6032



6033 



39 



1694 






■i'LGYLDLLEHVLVlLQKPTPELKQQLAAFSKRVAGAVTELXQAA" 
EAMKGTEWVDPEDPTVIAETELIiGAAASIEAAAKKLEQLKPRffK 
PKQADETLDFBBQILEAAKSIAAATSALVKSASAAQRELVAQGK 
VGS IPANAADDGQWSQGLISAARMVAAATSSLCEAANASVQGHA 
SEEKLISSAKQVAASTAQLLVACKVKADQDSEAMRRLQAAGNAV 
KRASDNLVRAAQKAAFGKADDDDVWKTKFVGGIAQI lAAQEBM 
LKK3RELEEARKKLAQ I RQQQYKPLPTELREDEG 



FPGRGSPALQLEVLICLGLMGLEHALNVIjAPIPyRNIVNIiLTEN 
APWNSLAWTVTSYVFLKFLQGGGTGSTGFVSNIilTFLWlRVQQr 
TSRRVEI,IjIFSHr*HEI,SLRWHIiGRRTGEVr,RIADRGTSSVTGLI* 
SYLVFNVIPTLADIIIGIIYFSMFFNAWFGLIVFLCMSLYLTLT 
IWTEWRTKFR3WayitrrQENATRAIiA\mSLIiNFETVKYYNAESYE 
VERYRBAI IKYQGLEWKSSASLVLXJJQTQNLVIGUSLXAGSLLC 
AYPVTEQKLQVGDYVLFGTYIIQLYMPLNWFGTYYRMIQTNPID 
MENMPDLI,KK\3TEVKDLPGAGPPRFQKGRIEFENVHFSyADGR 
BTLQDVSFTVMPGQTUVLVGPSGAGKSTIIJaibPRPYDlsSGCI 
RIDGQDISQVTQALFRPSHWELCPKDTVLPNDriADNIRYGRVT 
AGNDEVEAAAQAAGIHDAIMAFPEGYRTQVGERGLKLSGGEiCQR 
VAIARTILKAPGIILLDEATSAIiDTSNERAIOASLAKVCANRTT 
IWAHRLSTWNADOIXtVIKDGCIVERGRHEALLSRGGVYADMW 
QLQQGQBETSEDTKPQTMER 



LRMSENU3KSNVNEAGKSKSNDSEEGI,EDAVEGADEAI*QKAIKS 
DSSSPQRVQRPHSSPPRFVTVEELLETARGVTWMACAHEXVVNG 
DPQlKPVELPENSLKKRVKEIVHKAFWDCLSVQIiSEDPPAYDHA 
IKLVGEIKETliLSFLLPGHTRIiRKQXTEVLDLDIiIKQEAENGAL 
DISKLAEFIJGMMGTLCAPARDEBVKKLKDIKBXVPtiFRElFSV 
LDLMKVDMANFAISSIRPHIiMCKJSVEYERKKPQEILERQPNSLD 
FVTQWLEEASEDLMTQKYKHAIiPVGGMAAGSGDMpRr^PVAVQN 
yAYLICI,LKWDHLQRPFPETVIiMDQSRFHELQLQ\REQr,TILGAV 
LLVTPSMAAPGISSQADFAEKLKMIVKIt.LTDMHLPSPHLKr)VL 
TTIGEKVCXEVSSCLSLCGSSPFTTDKETVLKGOIQAVASPDDP 
IRRIMESRIJCTFLETyLASGHQKPLFTVPGGIiSPVQREJCBEVAI 
KFARLVNYKKMVFCPYYPAILSKILVRS 



AARLCRAQPTKSAWMIRDLSKMYPQTRHPAPHQPAQPFKPXISE 

SOTRIKEEFQPI^AQYHSLKLECEKIASEKTEMQRHYVMYYEMS 

YGLMIEMHKQAEIVKRLNAICAQVIPPLSQEHQQQWQAVERAK 

QVTMAELMAIIGQQQI>QAQHLSHGHGl4PVPLTPHPSGLQPPAIP 

PIGSSAGLLALSSALGGQSHLPIKDEKKHHDNDHQRDRDSIKSS 

SVSPSASPRGAEKHRNSADYSSESKKQKTEEKEIAARYPSDGEK 

SDDNLWDVSNEDPSSPRGSPAHSPRENGLDICrRLLKKDAPISP 

ASIASSSSTPSSKSKEIiSIJIEKSTTPVSKSNTPTPRTDAPTPGS 

KSTPGLRPVPGKPPGVDPIASSIiRTPMAVPCPYPTPFGIVPHAG 

MNGELTSPGAAYAGIiHNISPQMSAAAAAAAAAAAYGRSPWGFD 

PHHHMRVPAIPPNLTGIPGGKPAYSFHVSADGQMQPVPPPPDAL 

IGPGIPRHARQINTLNHGEWCTVVTISNPTRHVYTGGKGCVKVW 

DISHPGNKSPVSQ):j>aLNRDNYIRSCIlliIiPDGRTI.IVGGEASTL 

srWDLAAPTPRIKAELTSSAPACYALAISPDSKVCFSCCSDGNI 

AVWDLHNQTLVRQFQGHTDGASClDISNDGTBOiWTGGLDNTVRS 

W\DLREGRQLQQHD/FFTSPVFSIiGyCP\TBEWIiAVGMENSN\V 

EVWm'KPDKyQt*HI.HESC\n:.SLKFAHCX5KWF\VSTGKDNLLNA 

W\RTPYG\ASIF\QSKESSS\VLSCDI\SVDDKYIVTGS\GDK\ 
RATVYEVIY 



AARLCRAQPTKSAWMIRDLSKMYPQTRHPAPHOPAQPFKFTISE 
SCDRIKEEFQPLQAQYHSLKLECEKIASEKTEMQRHYVMYYEMS 
YGLNIEMHKQAEIVjEtRI^ICAQVrPFIiSQEHQQQVVQAVERAK 
QVrMAELNAI IGQQQliQAQHZ^SHGHGLPVPLTPHPSGLQPPAlP 
PIGSSAGLLALSSALGGQSHLPIKDEKKHHDNDHQRDRDSIKSS 
SVSPSASFRGAEKKRNSADYSSESKKQKTEEK3IAARYDSDGEK 
SDDNLWPVSNEDPSSPRGSPAHSPRENGLDKTRLLKKDAPISP 
ASIASSSSTPSSKSKELSLMEKSTTPVSKSNTPTPRTDAPTPGdf 
NSTPGLRPVPGKPPGVDPlASSLRTPMAVPCPYPTPFGtVPHAG 




6034 






MNGELTSPQAAYAqLHNlSPQMSAAAAAAAAAAAYGRSPWGFir- 

PmniMRVPAIPPNLTGIPGGKPAySFHVSADoQMQPVPFPPDlL 

IGPGIPRHARQINTLNHGEWCAVTISNPTRHVYTGGKGCVKVW 

DISHPGNKSPVSQLDCLNRDNyXRSCRLLPDGRTLIVGGEASTL 

S IWDLAAPrPRIKAELTSSAPACYALArSPOSKVCFSCCSDGMl 

AVTOWmQTLVRQFQG:HrrDGASCIDISm)GTKLWTGGLDNTVRS 

W\DLREGRQLC3QHD/FFTSPVFSLGYCP\TrEBWLAVGMENSN\V 

EVI^KPDKYOlJlLHESCVLSLKI'AHCGKWFXvSTGKDNLmA 

^^^^^^^^"^^^^^^SSSWLSCDlXsVDDKYIVTGSXGDKt 




6035


2683 
13 


714 


i-^>«cw«RjjjsjacKiit'CPt3iA(AiPGi:!;TNtJGPGACPRGPREEAAAAM" 
EIAPQEAPPVPGADGDIEEAPAEAGSPSPASPPADGRLKAAAKR 

^\£j^ffKXiijK.\iijjt!.if rDLGHRLDCLDLKGEKLDYKTCEAljEEVFKR 

iS!^r^^^^^°^^^^^"^^yESATHLNISFNKHIGT 

RGWQAAAHMMRKTSCLQYLNDARNTPLLDHSAPFVARALRIRSS 

lAyiiHLBNASLSGRPIrfllJlATALKMNKNIiRE^YLNADNK^ 

pSAQLGNLLKFNCSLQILDLRNNHVLDSGIAYICEGIiKEQRKGI, 

\JL\VLMNNQLrHTGMAFLGMTI>PHTQSLETLOT^GHNPIGNEGV 

RHLKNGLISNRSVI,RLGIJ\STKLTCEGAVAVAEP1AESPRLLRL 

DLREI^IKTGGLMAI^LALKVNHSLLRIJJLDREPKKEAVKSFIE 

TQKALLAEIQNGCKRNLVIAREREEKEQPPQLSASMPETTATEP 

OPDDEPAAGVQNGAPSPAPSPDSDSDSDSDGEEEEBEEGER^ET 

PSGAIDTRDTGSSEPQPPPEPPRSGPPLPNGLKPEFA1AI,PPBP 

PPGPEVKGGSCGIiEMEI^qOQTfMWTrr'r mrr T T 




6036 




404 


* V 1 Xi.(^XimKNTGALPADPVQLISQrPTPSTKOQr,x;s"FI.GMVG 
YFYLWIPGFAILTKPLCKLTKENIADAIDPKSFSHSSFRSLKTA 
LENASTI^PDSSQPF\SIJn*AEVQGnWRTT.'rnf2T/:im.m. 


1 




1745 


356 


i.VDVKKIXSRRKGRKMDSVEKGAATSVSNPRGRPSI«3RPPKU2RN ' 
SRGGQGRGVEKPPHLAALILARGGSKGIPLKNIKHIAGVPLIGW 
VLRAALDSGAFOSVWVSTDHDEIENVAKQFGAQVHRRSSEVSKD 
SSTSLDAIIEPLNYHNEVDIVGNIQATSPCLHPTDLQKVAEMIR 

EEGYDSVFS WRRHOFRWSEIOKGVRFUTPCT MT Mnn xm n^ny-^rx 
*^"'^*^-*'W^wfvtta V Aisi/jjWijWPAKRPRRQD 

WDGELYENGSFYFAKRHLIEMGYLQGGKMAyyEMRAEHSVDIDV 

DIDWPIAEQRVLRYGYFGKEKLKEIiCLLVCNIDGCLTNGHIYVS 

GDQKEIISYDVKDAIGISLLKKSGIEVRLISERACSKQTLSSLK 

LDCKMEVSVSDKIAVVDEWRKEMGLCWKEVAYLGNEVSDEBCLK 

RVGLSGAPADACSTTVQKAVGYlCKCNGGRGAMREPAEHICVLIi 
MBKGLINFMPKNRNIAVNIGEKK 


6038


2936 


1919 


WTSWWMSSVLlMI,LF5IiQGNiCMLNYSAPSAGGYlJ:.PRKPVGTPA 
GGGFPRRKSVTLPSSKFRQWQLLSSLKGEPAPAI,fi qpnq w T?pnD 

SFSEGGERLLPTQKQPGGGQVNSSRYKTNELCRPFEENGACKYG 

DKCQFAHGIHELRSLTRHPKYKTELCRTFHTIGFCPyGPRCHFI 

HMAEERRALAGARDI^ADRPRLQHSFSFAGPPSAAATAAATGLL 

DSPTSITPPPILSADDLLGSPTLPDGTNNPP\APSSQELASL^A 

PSMOI.PGGGSPrrFLPRPMSESPHMPDSPPSPQDSLSDQEGYis 
SSSSSHSQ*^^^PTT.7^MCDt^T nTt:»<f t^t ^ vah.^ 
^^•.»»j*j»?«£>*jaiJo J. jbl^WiKKIiPX FSRIiS rSDD 




1450 


426 



bbALQEFGTRNHTFGVPLPHRRKUllSCNICQLRFNSDSQAAAH 
YKGTKHAKKLKALEAMKNKQKSVTAKDSAKTTPTSITTNTINTS 
SOKTDGTAGTPAlSTTTTVEIRKSSVMTTErTSKVEfCSPTTATG 
^SSCPSTETEEBKAKRLLVYCSLCKVAVNSASQLEAHNSGTKHK 
rMLEARKGSGTI KAFPRAGVKGKGPVNKQNTGU3NKTFHCEICD 
/HVNSETQLKQHISSRRHKDRAAGKPPKPKYSPYWKLQKTAHPL 
5VKLVFSKEPSKP1APRILPNPIUAAAAAAAAVAVSSPFSLRTAP 
^TLFOTSALPPAUJIPAPGP IRTAHTPVLFAPY 




1000 1 


.DEYEARLTLA^l.DDFEEDNEDDDENRVWQEEKAAKITjSLINKL 
IPLDEAEKDLATVNSNPFDDPDAAELNPFGDPDSEEPITETASP 
IKTEDSFYNNSYNPFKEVQTPQYI.NPFDEPEAFVTIKDSPPQST 
::<KNIRPVDMSKYLYADSSKTEBEBr)DESKPFYEPKSTPPPWNL 
WPVQELETERRVKRKAPAPPVLSPKTGVLNENTVSAGKDLSTS* 



6040 



473 



1052



6041 



3866 



6042 



1306 



253 




i-KPSPIPSPVI^KKPNASQSLLVWOCEVTKNYRGVKilWFTf^ 
RNGLSFCAILHHFRPDLIDYKSLNPQDIKEIWKKAYDGFAsibl 

SSKSrYKVGtrYETDTNSSVDQEKPYAELSDLKREPELQOPISGA 

VDFLSQDDSVPVNDSGVGESESEHQTPDDHLSPSTASPYCRRTK 
SDTEPQKSQQSSGRTSGSDDPGTP.Q>rmc»Prt5v^t,r r 




^ X ur.^ux. X VSDKKKDMS PPFI CEETDEQKLQTLDIGSNl^KEK 

LEWSRSLECRSDPESPIKKTSLSPTSKLGYSYSRDLDLAKKKIIA 

SLRQTBSDPDADRTTLNHADHSSKIVQHRLLBRQEELJCERARVI. 

IiEQARRDAALKAGNKHimJTAAPFCNRQLSDQQDEERRRQLRER 

ARQLIAEARSGGKMSELPSYGERAAEKLKERSKASGDBNDWXEI 

DTNEEIPEGFWGGGDELTNIiENDLDTPEQNSKLVDI^KLKKLLE 

VQPOVANSPSSAAQKAVTESSEQDMKSGTEDLRTERLQKTXERP 

RKPWPSKDSTVRKTQLQSFSQYIENRPEMKRQRSIQEDTKKGN 

EKICAAITETQRKPSEDEVI^I0GFKDS\SQYWGEI»AALENEQKO 

niT^^^^™^^^*^^^2EEAMMQEWFMLVNKKNALIR 

RKNOLSLt^KEHDLERR YELtiNREIxRAMLAI EDWQKTEAQKRRE 

QIJ[^E::,VALVNKRDALVkDr»mQEK(>AEEEDBHLERTI.EONKG 
KMAKKEEKCVLQ 



^x^iTAPSCAFPVQFRQ Pj.VbGLSQlTKSLYISNGVAAKNKI^ ' 

LSSNQ^TMVINVSVEWNTnYEDIQYMQVPVADSPNSRiCDFFD 

PIADHIHSVEMKQGR\lT.XaCAAGVSRSAALC:iAYLMKYHAMSr. 

I^AHTWTKSCRPIIRPNSGFWEQLIHYEFQLFGKNTVHMVSSPV 
GMIPDIYEKEVRXiM IPL 

iKi^)KKTAHNI.ENVLIHFWER UiElCVAKISEVKADVKSVLQVS 



X ti.AUCKXAHWI.ENVLIHFWE RUiElCVAKISEVKADVKSVLQVS 
Wl^VLQKPKGSLKSSKKKWGKVRFADErLESNKENEKCVSSEG 
EKIECWELTTEPSLTHNSSGJUX^PUIKKPLEDLVCKLADISINY 
VNERKSEQHLRFLSTLLDSPSSSRVFKMI^LGDEKQSIVQAKPLB 
IAKLVQKNPAVQFI,YQKLIGWI>NEDQRKDRJFLVDII>YSAI.RCC 
DNDMERKKVUDDLTKVDLKWNSLLKI lEKACPSSDKHALVTPWL 
KGDILGEKLVNLADdiCNEDLESRVSSESHFSERWTLL^SLVLSO 
HVKNDYLXGDVYVERIIVRLHBTUFKTiCKLSEAESSDSSVSFIC 
D^/AyWYFSSAKGCLLMPSSEDLLLTLFQLCAQSKEKTHLPDFLI 
CKbKrrrWW;GVNLLVHQTDSSYKESTFr.HLSALWr.KMQVQASSI. 



in«wi>£^MQWl.HRPLLEGRLSLWYECFKTDFKEQDIKTLPSHLCT 
SAIiLSKMVLXAlJiKETVLENNELEKlIAELLYSLQWCEEI£)NPP 
IFLIGPCEILQKMNITYDNLRVLGNMSGLLQr.LFNRSREHGTLW 
SLIIAKLILSRSISSDEVKPHYKRKESFFPLTEGKLHTIQSLCP 
FLSKEEKKEFSAQCIPALLGWTKKDLCSTNGGFGHLAIPNSCLO 
TKS IDDGE:,IJI6ILKI I ISWKKEHEDIPI,FSCNLSEASPEVLGV 
NIEIIRFLSLPLKYCSSPLAESEWDFIMCSMIAWLETTSENQAL 
YSIPLVQLFACVSCDLACDLSAPPt)SrTLDTIGNLPVNt,rSEWK 
EFPSQGIHSLLLPILVTVTGENKDVSETSFQNAMLKPMCSTLTY 
ISKEQLI^HKI^PARLVADQKTinbPEYIiQTLLNTr^LLLPRARP 
VQIAVYHMIiYIOMPELPQYDQDNLKSYGDEEEEPALSPPAALMS 
r,LSIOEDX,LENVLGCIPVGQIVTlKPLSEDFCyVIiGYX,LTWKLl 
LTFFKAASSQLRALYSMYLRKTKSLNKLLYHLFRLMPENPTYAE 

TAVEV?NKDPKTPPTEET.nT.C TtJ O'nfnwr «VTr^«,T^ * 2~ 



STQLFNGMTVKARATTREVMATYTIEDIVIELIXQBPSmrPl^S 

IIVESGKRVGVAVCKJWRNWMLQLSTYLTHQNGSIMEGLALWKNK 

VDKRFEG\^DCMICFSVIHGFWYSLPKKACRTCKKKFHSA\CLY 
KWFTSSNKSTCSLCRETFF ^X^^ii 

MAEIAMsPs5lKASVg5ra^^ 

GSPAPSHSLPLQPRSGGSLCPSRAW/PDPHQLFDDTSSAQSRGY 
GAQRAPGGI^YPAASPTPHAAPrjy>PVSNMAMAyGSSrAAQGKE 
LVDKNIDRFIP iTKLKYYFAVDTMYVGRiOGLLFPPYLHQDWEV 
QYCQDTPVAPRPDVNAPDLYrPAMAFlTYVLVAGLALGTQDRFS. 
PDLIXSI^ASSAlAWLTLEVlAII^SI^YLVrVNTDLTTlDLVAP? 
GYKYVGMIGGVl^GLr.FGKIGYYLVLGWCCVAIPVFWIRTI.RLK 


6043 


403 


599 


iIiADAAAEGVPVRGAimQLRMYLTMAVAAAQPMLMYWI.TFHLVR 

LCLFFFFPCATPVLPliPSLISAl./CLSHLSVSSWFCPCQPPd?^ 

PLPPLQNKTAKGSLSTEQSERG 


6044 


793 


412 


KLEMWNFTLISKVKISREVTMIASKFGIGQQVRHSLLGYLGVW 
DIDP VYSLS E PSPDELAVNDELRAAPWYHWMEDDNGLPVHTYIi 
AEAQLSSELQDBHPXEQPSMDELAQTIRKQIiQAPRLRN 


6045


155


2299 


SPIiPQVAAMWYLRRRLSDSNFMANLPKGYMTDLQRPQPPPPPPG 

AHSPGATPGPGTATAERSSGVAPAASPAAPSPGSSGGGGFFSSL 

owrtv ts^v I xju\j\m\it i>ay vtaGGSGGAGRGGAASRVLLVIDEPKT 

DWAKYPKGKKIHGBIDIKVEQAEFSDLNLVAHANGGFSVDMEVIi 

RNGVKWRSLKPDFVLIRQHAFSMARNGDYRSLVIGLQYAGIPS 

VNSLHSVYNFCDKPWVFAQMVRLHKKXGTESFPLIDQTFYPNHK 

EMLSSVTTYPVWKMGHGTLWGWGKVKVDNQHDFQDIASWALT 

KTYATAEPFIDAKYDVRVQKIGQNYKAYMRTSVSGNWKTNTGSA 

MLEQIAMSDRYFOiWVDTCSEIFGGLDICAVEALHGKDGRDttllE 

VVl?i>i5Tqt'ljIGDHQDEDKQLIvELVVNKMA0^ 

GSHGQTPSPGALPLGRQTSQQPAGPPACX)RPPPQGGPPQPGPGP 

QRQGPPLQQRPPPQGQQHLSGLGPPAGSPI.PQRIjPSPTSAPQQP 

ASQAAPPTQGQGRQSRPVAGGPGAPPAARPPASPSPQRQAGPPQ 

atrqtsvsgpappkajsgappggqqrqgppqkppgpagptrqasq 

AGPVPRTGPPTTQQPRPSGPGPAGRPKPQLAQKPSQDVPPPATA 

aaggpphpqlnksqsltnapnlpepapprpslsqdevkabtirs 
lrkspaslfsd 


6046 


212 


1075 


egltcpcervpfllgrgrphgatraghrravrwagpeslppLpr 
slimdspragthqgpldabtevgadrcxstayqeqrpqveqvgk 
qaplspglpamggpgpgpcedpagaggagaggsepiivtvtvqca 

wvpl peeeslqrawqdaaacprglqlqcrgaggrpvlyqwaqh 

SYSAQGPEDLGFRQGDTVDVLCEVDQAWIjEGHCDGRIGI FPKCF ' 
WPAGPRMSGAPGRLPRSQQGDQP 


6047 



49 


1405 


PVLVTSLRMREADTrJiPPQLMEVSADI ISTVEFNHTGELLATGD ' 

KGGRWIPQREPESKNAPHSQGEYDVYSTFQSHEPEPDYLKSIiE 

lEEKINKIKWLPQQNAAHS LLSTNDKTIKLWK I TERDKRPE6YN 

UCDEEGKLKDLSTVTSLQVP VLKPMDLMVEVS PRRI PANGHTYH 

INSISVNSDCETYMSADDLRimiWHIAITDRSFTPVNIVDIKPA 

NMEDLTEVITASEFHPHHCNLFVYSSSKGSLRLCDMRAAALCDK 

HSKLFEEPEDPSNRSFFSE t IS\SVSDVKFSHSDRY^!LTR\DYb 

TVKWrDL\NMEARPIETYQVHDYLRSKLCSLYENDCIFDKFECA 

WNGSDSVrMTGA\YNNFFRMPDRNTKRDVTL\EASRESSKPRAV 

LKPRRVCVGGKRRRDDISVDSLDFTKKILHTAWHPAENIIAIAA 

TNNLYIPQDKVNSDMH 


6048 


1 


3194 


GIRTPKFCDSPTSDLEMRNGRGRGKRMRPNSNTPVWETATASDS""'" 

KGTSNSSKTRAGANSKGRRGSQNSSEHRPPASSTSEDVKASPSS 

ANKRKNKPI>SDMELWSSSEDSKGSKRVRTNSMGSATGPr,PGTKV 

EPTVLDRNCPSPVLIDCPHPNCWKKYKHINGLKYHQAHAHTDDD 

SKPEADGDSEYGEEPILHADLGSCNG\ASVSQK\GSLSPARSAT 

PKVRLVEPHSPSPSSKFSTKGLCKKKTiSGBGDTDLGALSNDGSD 

DGPSVMDETSNDAFDSLERKCMEKEKCKKPSSLKPEKIPSKSIjK 

S ARPI /APIAI PPQQIYTFQTATFTAAS POSSSGLTATVAQAMP 

NSPQLKP rQPKPTVMGEPFTVNPALTPAKDKKKKDKKKKESSKE 

LESPLTPGKVCRAEEGKSPFRESSGNGMKMEGIiliNGSSDPHQSR 

IiASIKAEADKI YS FTDNAPSPS XGGSSRLENTTPTQPLTPLHW 

TQNGAEASS VKTNS PAYSDISDAGEDGEGKVDSVKSKDAEQLVK 

EGAiCKTDFPPQPQSKDSPYYQGPESYYS PS VAQSS PGALNPSSQ 

AGVESQTUiKTKRDEEPESIEGKVKNDI CEEKKPEIiSSSSQQ PSV 

IQQRPKMYMQSLYYNQYAYVPPYGYSDQSYHTHtiLSTNTAYRQQ 

yEEQQKRQStiEQQQRGVDKKAEMGLKEREAAliKEEWKQKPSIPP 

TLTKAPSLTDLVKSGPGKAKEPGADPAKSVIIPKLDDSSKI.PGQ 

APEGLKVTCLSDASHLSKEASBAKTGAECGRQAEMDPILMYRQEA 

EPRMWTYVYPAKYSDIKSEDERWKEERDRKLKEERSRSKDSVPK 


6049 






ED^STSSDCKbPTSEESRLGSKbPKPSVHVPVySPLTQHOSy- 
rPyMHGYSySQSYDPNHPSYRSMPAV^5M0NyPGSYLPSSYS^SP 
VGSKVSGGEDADKARASPSVTCKSSSESKALDILQQHASHYKSK 
SPTISDKTSQERDRGGCGWGGGGSCSSVGGASGGERSVDRPRT 

stpslypSrr^^^^^^^"^^^^*^ 


6050 


215 


1089 


AMTGVFDRRVt>SIRSGDFQAPFQTSAAMHHPSOKSPTI,PESSAT^ 
©SDYYSPTGGAPHGYCSPTSASYGXKALNPYQYQYHGVNGSAGS 
YPAKAYADYS YASS YHQYGGAYNRVPSATNOPPinrvTvi^ oc^rriMtr 

NGKPKKVRKPRTIYSSFQIAALQRRFQKTQYIAbPERAELAASL 
GLTQTQVKIWFXJNKRSKIKKIMKNGEMPPEHSPSSSDPMACNSP 
QSPAVWEPQGSSRSLSHHPHAHPPTSNQSPASSYLENSASWYTS 
AASSXNSHI,PPPGSI.OHPIALASGTLy 


6051 


566 


1718 


KGLERTCCAMfc;ESDSEKTTKKKNLGPRMDPPUjEPQ\G$LGWVL " 
PNTAMKKKVLLMGKSGSGKTSMRSIIFAWYIARDTRRLGATILD 
RIHSLQINSSI,STYSI.VDSVGNTKTrDVEHSHVRFLGNLVUIt.W 
DCGGQDTPMENYFTS0RD^7IFHNVEVLI YVFDVHSRBLEKDMHY 
YQSCI»EAlLONSPDAKI PCLVHIfMrtr.uni?nrit>rir T-nt^msT^i^*^* 

RLSRPIaECSCFRTSIWDETLYKAWSSXVYQLrPMVQQLEMNLRN 
PAErrBADEVLLPERATPLVISHYQCKEQRDAHRFEKISNIIKQ 
FKLSCSKLAASFQSMEVRUSNFAAFIDIFTSNTYVMWMSDPSI 
PSAATL1N1RNARKHFEKLERVDGPKQCLW5R 


6052 


566 


1718 


KUX^ERTCCAMKhiSDSBKTTKKHtjLGPRMDPPl^KPGVGSLGWVI. 
PNTAMKKKVLLMGKSGSGKTSMRSIIPANYIARDTRRLGATILD 
RIHSLQlNSSIiSTYSIiVDSVGNTKrFDVEHSHVRFlX3NI.VLNLW 
DCGGQDTPMENYFTSQRDNIFRNVEVLIYVFDVESRELEKDMHY 
YQS CLEAI liQNSPDAKI FCL VHKMDLVDEDOP nr .T Pif i?t> T7TrnT r. 

RI»SRPJ:.ECSCFRTSIWDETt.YKAWSSIVyQLIPNVQQLEMNI*RN 
FAEIIEADEVLLFERATFLVISHYQCKEQRDAHRPEKISNIim ^ v 
FKLSCSKIJUVSFQSMEVRNSNFAAFIDIFTSNTYVMVVMSDPST ^ 
PSAATLINIRNARKHFEKLERVDGPKQCLLMR 


6053 


566 


1718 


i^OiihRTCCAMEESDSEKTTEKENLGPRMDPPI>GEP6VGSLGWVL 
PNTAMKXKVLLMGKSGSGKTSMRSIIFANYIARDTRRLGATILD 
RIHSbQINSSLSTYSLVBSVGNTKTPDVEHSHVRFLGNLVLNLW 
DCGGQDTFMENYFTSQRDNlFRNVEVLIYVFDVESREI,EKDMHy 
YQSCLEAILQNSPDAKIFCLVHKMDLVQEDQRDLIFKEREEDX^ 
RLSRPLECSCFRTS I WDETLYKAWSS I VyQLIPNVQQliEMNLRW 
FAEIIEADEVLLFERATPLVISHYQCKEQRDAHRFEKISNIIKQ 
PKLSCSKnAASFQSMEVRNSNFAAFlDIFTSNTYVMWMSDPSI 
PSAATLINIRNARKHFEKLERVDGPKQCLLMR 


6054 


201 


1704 



KCTEMWKSRWQSRRRHGl^jjHUUNPWFRLRDSBDRSDSRAAQPA ' 

HDSGHGDDESPSTSSGTAGTSSVPELPGFYFDPEKfCRYFRLLPG 

HNNCKPLTKESIRQKBMESKRLRLLQEEDRRKKIARMGFNASSM 

LRKSQLGFLNVTNYCHLAHELRLSCMERKKVQIRSMDPSAIASD 

RFNLILADTNSDRLFTVNDVTVGGSKYGIINLQSLKTPTLKVFM 

HENLYFTWRKVNNSVCWASWJHLDSHILLCIMGIiAETPGCATLL 

PASLFVWSHPAGIDRPG\MLCSFRIPGAWSCAWSLNIQANNCFS 

C3:iIFAIDLRCGNC3GKGWKATRLPHDSAVTSVRlI.QDEQYLMASD 

MAGKIKLWDLRTTKCVRQYEGHVNEYAyLPLHVHEEEGILVAVG 

3DCYTRIWSI.HDARLI.RTIPSPYPASKADIPSVAFS$RLGGSRG 
^PGIJJ^VGQDLYCYSys ' 




1 


1054


^r'lAKM21::FGTSRRHMAAl>SGVHt.LVRkGiiHklFSSPLNHIYLH " 

CQSSSQQRRNFFFRRQRIDISHSIVUPAAVSSAHPVPKHIKKPDY 

mGIVPDWGDSIEVKNSDQIQGLHQACQUJiHVLLLAGKSLKV 

JMTTBEIDALVHREIISKNAYPSPLGYGOFPKSVCTSVNNVLCH 

;i?DSRPLQDGDIIKIDVlVYYNGYHQDTSETFLVGNVDEOGKK 

iVEVARRCRDEAlAACRAGAPPSVIGNTISHITHQNGFQVCPHF 

^GHGIGSYFHGHPEIWHHANDSDLPMBEGMAFTrEPlITEGSPK 

'KVLEDAWTWSLD/TSKVSAQFEHTVLITSRGAQrT.TTrT.PFFf 
6055


6056 


421 


2364 


PPYFLLS FIAWWLYGO^imTETD'lSOqAnPPPrt'rTY^r'q&T uljAt^ - 

GCANCSRFCRDCSPPACQCHTKVFPGNALNG^/QPPELSRTU^I 

SSREPPRKKKKSQTETGKERERTSPLTQGGKRFEXQHGLAGICM 

TLLITGDSIVSAEAVWDHVTMAWREIAFKAGDVIKVLDASNKDW 

WWGQIDDEEGWFPASFVRLWVNHEDEVEEQPSDVQNGHLDPNSD 

CLCLGRPLQNRDQMRAJNVINEIMSTERHyiKHUODICEGYLKQC 

RKRRDMFSDEQLKVIPGNIEDIYRPQMGFVRDLEKQYUNDDPHI, 

SEIGPCFLEHQDGF>?IYSBYCKNHLI>ACMEX*SICUJKDSRYQHFF 

EACRLLQQMIDIA\IDGFLLTPVQKICKYPLQLAELLKYTAQDH 

SDYRYVAAALAVMRNVTQQINERKRRLENIDKIAQWQASVLDWE 

GBDILDRSSELIYTGEMAWIYQPWGRNQQRVFFLFDHQMVLCK 

KDI.IRRDILYYKGRIDMDKYEVVDIEDGRPDDFNVSMKNAFKLH 

NKETEEIHLFFAKKLEEKIRWLRAFREERKMVQEDEKIGPBISE 

NQKRQAAMTVRKVPKQKGVNSARSVPPSYPPPQDPLKHGQYI.VP 

\DGIAQ3QVFEFTEPKRSQSPFWQNFSRLTPPKK 




43 


3358 


iiGGRGPVRVRSEQI,SPSAEQVSQISQISLGRRPLS5LPPPPSRA ' 

lAPTRAPDTALTIMEVAEVESPLNPSCKlMTPRPSMEEFREFNK 

YIAYMESKGAHRAGIAKVIPPKEWKPRQCYDDIDNItLIPAPICm 

MVTGQSGLFTQYNIQKKAMTVKEFRQLANSGKYCTPRYLDYEDL 

ERKYWKNLTFVAPiyGADINGSIYDEGVDEKNIARl4NTVX>DVVE 

EECGISIBGVNTPYLYFGMWKTTFAWHTEDMDLYSINyLHFGEP 

KSWYAr?PEHGKRI.ERLA0GFFPSSSOGCDAPLRHKMTLISPSV 

LKKYGI^FDKITQEAGEFMITPPYGYHAGPNHGFNCAESTNFAT 

VRWIDYGKVAKLCTCRKDMVKISMDIPVRKFQPDRyQLWKQGKD 

IYTIDHTKPTPASTPEVKAWI>QRRRKVRKASRSFQCARSTSKRP 

K?U)EEEEVSDEVDGAEVPNPDSVTO0IiKVSEKSBAAVKIJWTEA 
SSEEESSASRMOVEONTi*inKTWT.cr'M«a/*'T <!T£^Yr«r>ws<rvnm«^«s.*» 

YAYRSVPSISSEADDSIPIiSTGYEKPEKSDPSEI^SWPKSPESCS 
SVAESNGVLTEGEESDVESHGNGLEPGEIPAVPSGERNSPKVPS 
lAEGENKTSKSWRHPIiSRPPARSPMTLVKQQAPSDEELPEVLSX 
EEEVEETESWAKPLIHLWQTKPPNFAABQBYNATVARMKPHCAX 
CTLLMPYHKPDSSNEENDARWETIOuDEVVXSEGKTKPLIPEMCF 
IYSEBNIEYSPPNAFLEEDGTSLr,ISCAKCCVRVHASCYGIPSH 
E ICDGWLCARCKRNAWTAECCLCtniRGGALKOTKKrWKWamrMr'a 

VAVPEVRFTNVPERTQIDVGRIPIiORLKLKCIFCRHRVKRVSGA 

CIQCSYGRCPASFHVTCAHAAGVL\MEPDDWPYVVNITCPRHKV 

NPNVKSKACEKVISVGQTVITKHRKTRYYSCRVMAVTSQTFYEV 

MPDDGS FSRDTFPEDI VSRDCLKU3PPAEGEWQVKWPDGKI,yG 

AKYFGSNIAHMYQVEPEDGSQIAMKREDIYTIJ3BELPKRVKARF 

VSAGRCHLGTCQVNSLSSPHVSQAQQETYLGFWINSKKSQCWIF 
LSGTY 


6057


1 


853 


FVAR1.KEQEGEGGLGPRKEKGRARGREKRRKMQLTRCCFVF1*^?Q" 

GSLYLVICGQDDGPPGSEDPBRDDHEGQPRPRVPRKRGHISPKS 

RP^WNSTLLGLLAPPGEAWGILGCPPNRPNHSPPPSAKVKKIFG 

WGDFYSNrKTVArJMLLVTGKIVDHGNGTFSVHFQHNATGQGNiS 

ISLVPPSKAVEFHQEQQIP1EAKASKIPNC\RMEWEKVE\RGRR 

TSLFTHDPAKICSRDHAQSSATWSCSQPFKWCVYIAFYSTDYR 


6058 
6059 


1 


986 


HFJ^PSASLGLPSVSr^VSLCVRSALLEAVVPMI^KRRRARVGS 

SGDAASSTPPSTRFPGVAIYLVEPRMGRSRRAFLTGLARSKGPR 

VLDACSSEATHWMEETSAEEAVSWQERRMAAAPPGCTPPALLD 

ISWLTESLGAGQPVPVECRHRLEVAGPSKGPLSPAVfMPAYACQR 

PTPLTHHNTGI^EALEIIAEAAGFBGSEGRI^TPCKAASVrJW^ 

PSP VTTLSQLQGLPHFGEHSSRWQEIiIiEHQVCEEVERVRRSB / 

RLFTQIPGVGVKTADRWYREGLRTLDDLREQPQKLTQQQKAGBP 

SRJSAGPWASLNCrLDPSAST? 




2 


3650 



aQDFESLADLTDHUUUlRCPGMDDDPQLSWVASSPSSKbVASPT 
2MIGDGCDLGLGEEEGGTGI.PYPCQFCDKSFlRLSyLKRHBQlH 
SDKLPFKCTYCSRLFKHKRSRDRHIKXaTGDKKYHCHECEATVFS 
^SDHI*KXHI*KTHSSSKPFKCTVCKRGFSSTSSLQSHMQAHKKN|f 
3HLAKS EKEAKKDDFMCDYCEDTFSQTEELEICHVLTRHPQLSEK 



WO 01/53312

PCT/US00/34263











ADLQCIHCPEVFVDENTLLAHIHQAHANQKH:<CPMCPE\QFSay 
\EGVyCHLDSHRQPDS SNHS VS PDPVLGS VASMSSATPDSSA^ 
BRGSrPDSTLKPLRGQKKMRDPGQGWTKWYSCPYCSKRDPNSI* 
AVLEIHLKTI^lADKPQQSHTCQlCtiDSMPTLYNtJJEHVRKriHKN 
HAYPVMQFGNISAFHCNYCPEMFADINSLQEHIRVSHCGPNANP 
SDGNNAFFCWQCSKGFLTESSLTEHIQ\Q\AHCSVGSAKliESPV 
VQPTQSFMEVYSCPYCTNSPIFGSrLKLTKHIKEKHKMIPLAHS 
KKSKAEQSPVSSDVEVSSPKRQRLSASANSISNGEYPCMQCDLK 
FSNFESFQTHLXLHLELLLRKQACPQCKEDFDSQESLLQHLTVH 
yMTTSTHyVCESCDKQFSSVDD\LQKH\Lt,B«PHPr*CCTHCT\I. 
CQEVFDS\KVS I \QVHLAVKHSNEKKMYRCTACNWDFRKEADIiQ 
VHVKHSHIiGNPAKAHKC I FOSETFSTBVEItOCHr TTHS KKYNCK 
FCS KAFHAI ILLEKHLREKHCVFDAATENGTANGVPPMATKKAB 
PADLQGMLLKNPEAPNSHEASEDDVDASEPMYGCDICGAAYrrME 
VLLQNHRLRDHNIRPGEDDGSRKKAJEFIKGSHKCNVCSRTFFSE 
ISK^.LREHKlTHRGPAKHYMCPICGERPPSLriTLTEHKVTHSKSLD 
TGTCRICKMPLQSEEEFIEHCCMHPDLRNSLTGFRCWCMQTVT 
STLELKIHGTFHMQKLAGSSAASSPNGQQLQKLYKCALCIiKBBTR 
SKQDLVKLDVNGIiPYGIiCAGCMARSANGQVGGLAPPEPADRPCA 
GI*RCPECSVKFESAEDIiESHMQVDHRDLTPETSGPRKGTQTSPV 
PRKKTYQCIKCQMTFENERE I QIHVANHMIEEGINHECKLCNQM 
FDSPAKLLCHLIEHSFBGMGGTFKCPVCFTVFVQANKLQQHIFA 
VHGQEDKIYDCSQCPQKFFFQTELQMHTMSQHAQ 


6060 


2145 


202 


SyEIVGKNKI,EVNHSQI.KAI*CKCSI>PSRLLPl/3ENI,PI,LDRGFR " 

KEPRSRGSRERDNMLHI^HHSCLCFRSWLPAMLAVLbSIiAPSASS 

DISASRPNILLIJ4At)DI/GIGDIGCYGWNTMRTPNIDRIiABDGVK 

LTQK1SAASLCTPSRAAFI.TGRYPVRSGMVSSIGYRVLQWTGAS 

GGLPTNETTPAKIl.EBKGYATGI*IGKWHIiGIiNCESASDHCHHPIi 

HHGFDHFYGMPFSLMGDCARWELSEKRVNLEQICLNFLFQVIASV 

ALTLVAGiCLTHI,IPVSWMPVIWSAI*SAVLLI*ASSYFVGAI.IVHA 

DCFLMRNHTITEQPMCFQRTTPLlIiQEVASFLKRNKHSPPLLFV 

S FLHVHI Pr»lTHEWFljGKSl»HGt»YGDNVKEMDWMVGRILDTIJ>V 

EGLSNSTLIYPTSDHGGSLENQLGNTQYGGWNGIYKOOKGMGGW 

EGG1RVPGIFRWPGVLPAGRVIGEPTSI^VFPTVVRI.AGSEVP 

QDRVIDGQDLLPIiliLGTAQHSDHEFLMttYCERPIiHAARWHQRDR 

GTMWKVHFVTPVFQPEGAGACYGRKVCPCFGEKWHHDPFLXiFD 

LSRDPSETHILTPASEPVFYQVMERNVQQAVWEHQRTliSPVPLQ 

IJDRIiGNIWRPWLQPCCGPFPX>CWCI>REDDPQ 


6061 


110 


1330 


MNIHMKRKTIKNINTFENRMIiMIiDGMPAVRVKTELIiESEQGSPN 
VHNYPDMEAVPLLLNKVKGEPPEDSLSVDHFQTQTEPVDLS INK 
ARTSPTAVSSSPVSMTASASSPSSTSTSSSSSSRIASSPTVITS 
VSSASSSSTVIfTPGPIiVASASGVGGQQFIiHIIHPVPPSSPMKLQ 
SNKI^HVHRIPVWQSVPVVYTAVRSPGimJNTIWPrjIJSDGRG 

hgkaqmdprgiis prqsksdsddddlpnvtldsvnetgstai.s la 
ravqevhpspvsrvrgnrmnnqkfpcsispfsiestrrqrtviin 
ppdsrktaystdcdpXeglqqklytkssspgrvhrrthtgekpy 

lalhrrrhmlv 


6062 


71 


1079 


ETMAKNGPENCEDCHILMAEAFKSKKICKSLKICGIiVTGIIiTUiT 
LIVIiPWGSKHFWPEVPKKAYDMEHTFYSNGEKKKIYMEIDPVTR 
TEIFRSGNGTDEILEVHDFKNGYTGIYFVGLQKCFIKTQI KVI P 
EFSEPEEEIDENEEITTTFFEQSVIWVPAEKPIENRDFLKNSKI 
LE r CDNVTMYW\ INPTL\ISGTFAKQLHHNFAFI ILVSELQDFE 
EEGEDLHFPANEKKGIEQNEQWWPQVKVEKTRHARQASEEELP 
INDYTENGIEFDPMI.DERGYCCIYCRRGNRyCRRVCEPIiIK3yYP 
YPYCYQGGRVXCRVIMPCNWWVARMLGRV 


6063 


71 


1079 


ETMAKNGPENCEDCHILNAEAFKSKKICKSIiKICGI*VFGII»ALT 
LIVLFWGSiCHFWPEVPKKAYDMEHTFySNGEKKiaYMEIDPVTR 
TBIFRSGNGTDBTLEVHDFKNGYTGIYFVGLQKCFIKTQIKVIP . 
EFSEPEEEIDEMEEITTTFFEQSVIWVPAEKPIBNRDFLKNSKJT 
liBI CDNVTMYW\ INPTL\ ISGTFAKQIiHHNFAFIILVSE^QDFE 


6064 






EEGbui.Hi-t^ANh:KKGIEQN£;uWVV^PQVKV£KTRHARUASEEELP " 

INDYTENGrEFDPMLDERGYCCIYCRRGNRYCRRVCKPLLG^Yp 

yPVCYOGGRVICRVIMPCWWWVARMLGRV 


6065 


913 


311 


MA.PQSJLPRPTEHSPPYSLHKMTDLVAVWDVALSDGVHKIEFEHG^ 
TTSGKRVVYVDGKEEIRKEWMFICr.VGKETFYVGAAKTKATINID 
AISGPAYEYTLEINGKSLKKYMEDRSKTTNTWVLHMDGENFRIV 
ItEKDAMDVWCNGKKIiETAGE FVDDGTETHFS IGTH\ACYI KAV\ 
_ SSQ\KRKEGI IHTLIVDNREIPEIAS 


6066 


1153 


641 


MSVRVARVAWVRGLGAS 5(rkHGASSFPVPPPGAQGVAELI.RDATG 
AEEEAPWAATERRMPGQCSVLLPPGQGSQWGMGRGLLNYPRVIl 
BLYAAAERVLGYDLLELSLHGPQETLDRTVHCQPAIFVASLAAV 
EKLHHLQPSVIENCVAAAGPSVGEFAALVFAGAMEFAEG 


6067


68 


3470 


VKENMPATUKPMRYGHTEGHTEVCF3DSGSF1VTCGSDGDVRIW- 

EDLDDDDPKFINVGEKAYSCALKSGKLVTAVSNNTIQVHTPPEG 

VPDGILTRFTTNANHVVFNGDGTKIAAGSSD\FLVKIVDVMDSS 

QQKTFRGHDAPVt^SbSFDPKDIFLASASCDGSVRVWQISDQTCA 

ISWPIiLQKCNDVINAKS 1 CRLAWQPKSGJaiAI PVEKB VKLYRR 

ES WSHQFDLSDNFISQTLMI VTWSPCGQYIAAGSINGLI I VWNV 

ETiOSCMERVKHEKGYAICGLAWHPTCGRISYTDAEQNLGI^ENV 

CDPSGKTSSSKVSSRVEKDYNDLFDGDDMSNAGDFLNDNAVEIP 

SFSKGlINDOEDDBDLMMASGRPRQRSHXLEDDENSVDISMLKr 

GSSLIiKSEEEDGQEGSIHNLPLVTSQRPFYDGPMPTPRQKPFQS 

GSTPLHLTHRFMVWNSIGIIRCYNDEQDNAIDVEFHDTSIHHAT 

HLSNTLWYTIADLSHEAir.LACESTDEIASiaHCIiHFSSWDSSK 

EWIIDLPQNEDIEAICLGQGWAAAATSAI^LRLPTIGGVQKEVF 

S]UAGP\A^SMAGHGEQI.PIVYHRGTGPDGDQCLGVQLLELGKKKK 

QILHGDPLPLTRKSYIAWIGFSAEGTPCYVDSEGIVRMLNRGLG 

NTWTPICNTREHCKGKSDHYWWGIHENPQQLRCIPCKGSRFPP 

TI,PRPAVAILSFKLPYCQIATEKGQMEEQFWRSVIFHNHIJ5YLA ^ 

KNGyEYEBSTKNQATKEQQEr.LMKMIALSCKLEREFRCVBIADI, 

MTQNAVNLAIKYASRSRKLILAQKLSELAVEKAAELTATQVEEE 

EEEBDFRKKIxKAGYSNTATEWSQPRFRWQVEEDAEDSGEADDEE 

KPEIHKPGQNSFSKSTNSSDVSAKSGAVTFSSQGRVNPFKVSAS 

SKEPAMSMNSARSTNILDNMGKSSKKSTAJCSRTTNNBKSPIIKP 

LIPKPK^KQASAASYFQKRNSQTNKTEEVKEENLKtIVI,SETPAI 

CPPCNTENORPKTGFQMWLEENRSNILSDNPDPSDEADIIKEGM 

I-^FRVLSTHERKVWANKAKGETASEGTEAKKRKRWDESDETEK 

QEEKAKENl^LSKKOKPLDFSTNQKLSAFAFKQE 


6068 


858 


321 


LPWQRLGVLljSRGKMAVTGWLESUiTAQKTAI*LQDGRRia«IYLF 
PDGKEMAEEYDEKTSELLVRKWRVKSA2.GAMGQWQLEVGDPAPI, 
GAGNLGPELIKESNANPIPMRKDTKMSPQWRIRNLPYPKDVYSV 
^^KERCI I VRTTNKKYYKKFS 1PDLDRHQLPI,DDAU,SFA\T 


6069 


13 


1730 



GSKMADLAWEEKPAIAPPVFVFQKDKGQKSPAEQKNI.SDSGEEP 
RGEAEAPHHGTGHPESAGEHALEPPAPAGASASTPPPPAPEAQL 
PPFPRELAGRSAGGSSPEGGEDSDREDGNYCPPVKRERTSSliTQ 
FPPSOSEERSSGFRLKPPTLIHGOAPSAGLPflOKPKEOOP c3Vt p 

PAVLQAPQPKALSQTVPSSGTNGVSLPADCTGAVPAASPDTAAW 

RSPSEAADEVCALBEKEPQKNESSWASEEBACEKKDPATQQAFV 

FGQNtiRDRVKLlNESVDEADMENAGHPSADTPTAIWYFLQYISS 

SLENSTNSADASSNKFVFGQNMSERVr^ppKIiNEVSSDANRENA 

WUBSGSESSSQEATPEKESLABSAAAYTKATARKCr>LEKVEVIT 

SEEAESrro^MQCKLFVFDKTSQSWVERGRGLLRLNDMASTDDG 

rUJSRtiSDAGPRGSLRXLILNTKLWAQMQXDKASEKVSIRITAM 

3NEDQGViCVPLrSASSKDTGQVYAALmmiLALRSRVEQEQEAK 

•IPAPEPGAAPSNEEDDSDDDDVrAPSa^^TAAGAGDEGDQQrreS 
P 




583



3 


'TRPGQAGSSSAMAAQRLGKRVLSKLQSPSRARGPGGSPGGIiQK " 

aiARVTVKYDRRSLQRRLDVPKWIDGKLEEr.YRGMEADMPDEIM 

IDELLEDESEEERSRKIQGLbKSOJKPVEDFlQELlJWa^bHft 








Q \ PGLRQPS PS ? \ DGQPSAPFQG PGARTASPLTLLALFPGPPKR 
RPALLCVLSCI 


6070 


478 


858


IRVTVDGEFbHYIFPLQFX,DSPBW/RFTETHRGRHF\QVTLTAE 
RVYIGKRYKYDIRLPNFYQMSTPEIRRSPLTQHFQNSRRYW 


6071 


2 


1654 


heartkgnmalarpVvrlfslvtrlllaprrgltvrspdeplpv 

VRlPVAriQRQLEQRQSRRRNLPRPVLVRPGPLl.VSARRPBLNQP 

arltlgrweraplasqgwksrrarrdhfsieraqqeapatokt^ 
skgsfadlgawkprvlhalqexaapewqNpttvqsstipsllr 

vjivci V Y wirtJL 4 v» Ot»iVi Ja^» I LiJjFLiIjiQR JjLG \HPSIiDS LPIPAPRGti 

VLVPSRSLAQQVRAVAQPLGRSIiGLbVRDTjKGGHGMRRIRLQLS 

FLELVDYILEKSHIAEGPADLEDPFNPKAQLVLVGATFPEGVGQ 
LLNKVASPDAVTTITSSKLHCIMPHVKQTFLRLKGADKVAELVH 
ILKHRDRAERTGPSGTVLVFCNSSSTVNWIiGYILDDHKIQHIJU* 
QGQMPALMRVGl FQSFQKSSRDII.I*CTDr ASRGIiDSTGVBLWN 
YDFPPTLQDYIHRAGRVGRVGSBVPGTVISFVTHPWDVSIiVQKI 
BLAARRRRSLPGLASSVKEPliPQAT 


6072 


1 


742 


KMERTEMMPTINSQLEFKSKPFPLVSSSRWLVKRGfiiii^pAYVEDT 
V ux-on,K. i. a>iv,yy V lev utfauVL*Z ITKKKSEESYNVNDYSLRDQLL 
VESCDNEELNSSPGKNSSTMtiYSRQSaASHLPTLTVI,SNHANEK 
VEMLLGAETQSERARWITALGHSSGKPPADRTSLTQVEIVRSFT 
AKQPDELSLQVADVVI,I\YQRVSDGWYEGER\LRDGERGWFPME 
GAKEITCQATIDKNVERMGRTjtjGXiETWV 


6073 


620 


860 


PC3iRGLARPLSRRPG/S I LVHCAVGVSRSATLVI*AYIjMLYHHIiT " 

ijv unAivcvv lUVXUvVJX ±triMKXjr J^KviJ-JJUAljlJKKJuJrcQCyXtEA 


6074 


168 


1X10 


PGARCMATiSUSCPDSMPCHNQQVNSASTPSPEQLRPGDLILDHA ' 

ggnrasrakvii^ltgyahsslpaeldsgacggsslnsegnsgsg 

DSSSYBAPAGNSFLEDCELSRQIGAQLKLLPMNDQIRELQTIIR 
DKTASRGDFMFSADRIilRLVVEEGLNQLPYKECMVTTPTCYKyE 
GVKPEKGNCGVSlMRSGEAMEQGLRDCCRSlRIGKIIiIQSDEET 

qrakvyyakfppdiyrrkvllmypilqtgVntvieavkvliehg 

VOPSVIILLSIiFSTPHGAKSriQEFPEITXLlT^VHPVAPTHFG 

qkyfgtd 


6075 


320 


1091 


PPTgQPQEVE:HH\YGYVPILGNKTI.PSRCHQCVIVSSSSHLLGT 

klgpeieraectirmndapttgysadvgnkttyrwahssvfrv 

LRRPQEFVNRTPETVPIFWGPPSKMQKPQGSriVRVIQRAGLVFP 

nmeayavspgrmrqfddlfrgetgkdrekshswlstgwftmvia 

VELCDHVHVYGMVPPNYCSQRPRLQRMPYHYYEPKGPDECVTYI 
QNEHSRKGKHHRFITEKRVPSSWAQLYGITFSHPSWT 


6076 


1721 


107 


HPSPTEAPRVQHLTMDCTWRILFLVAAATGTHAQvbtVQSGAEV 
KKPGASVKVSCKVSGYTLTELSMHWVRQAPGKGLEWMGAPDPBD 
GETIYAQKFQGRVTMTEDTSTDTAYMEt^Srj^SEDTAVYYCATD 
HGDYAF0I WGQGTM VTVSSAPTXAPDVFP I ISGCRHPKDNSPW 
I*ACLITGYH?rsV\TVTWYMGTQSQA\QRTFPEIQRRDSYYMTS 
SQLSTPLQQWRQGEYKCWQHTASKSKKEIFRWPESPKAQASSV 
PTAQPQAEGSLAKATTAPATTRNTGRGGEEKKKEKEKEEQEERE 
TKTPECPSH1X}PI/?VTLLTPAVQDI^LRDKATFTCFWGSDLKD 
AHLTWEVAGICVPTGGVEEGLLERHSNGSQSQHSRLTLPRSLWNA 
GTSVTCriiNHPSI^PPQRLMArJlEPAAQAPVKLSLNLLASSDPPE 
A\ASWLLCEVSGFSPPNnjLMWLEDHGEVNTSGFJ\PARPIi?KP\ 
RSTrFWANWSVTiRVPAPPSPQPATYTCWSHEDSRTLLNASRSL 
EVSYVTDHGPMK 


6077 


3687 


126 B 


LLPDMNLQPl i'WIGLISSVCCVFAQTDENRCLKANAKSCGEtllb 
AGPNCGWCTNSTFLQEGMPTSARCDDLEALKKKGCPPDDIENPR 
GSKIJIKKNKNVTNRSkGrAEKLKPEDITQIQPQQLVLRLRSGEP 
QTFTLKFKRAEDYPIDIiYYIM\DIiSYSMKDDLENVKSLGTDLMN 
EMRRITSDFRIGPGSFVEKIVMPYISrrPAKLRKPCTSEQMCTS 
PFSYKNVLSLTWKGEVFNEtiVGKQRISGNIiDSPEGGFDAIMQVA 
VCGSLXGWRiaWRLLVFSTDAGFHFAGIXSKWKSrVIiPNDGQCfro 








ENNMyTMSHyyUY^'aiAHLVQKL'SENNIQTIFAVTEEFQPVYKE " 

LKNLIPKSAVGTIiSANSSNVIQLliDAYNSLSSEVILENGKXBE 

GVTISYQSY\CKNGVNGTGENGRKCSNISIGDBVQFEISITSNK 

CPKKDSDSFKIRPIiGFTEEVEVTLQYICECECQSEGIPESPkCH 

EGNGTFECGACRCNEGRVGRHCECSTDEVNSEDIGCFTARKENQ 

FQKSASNHGRVPSAGQCVCRKRDimiEiySGiCFCECDNFNCDRS 

NGLICGGKGVCKCRVCECNPNYTGSACDCSLDTSTCEASNGQIC 

KGRGICECGVCKCTDPKFXJGQTCEMCQTCLGVCAEHKECVQCRA 

FNKGEKKDTCTQBCSYPNiTKVESRDKIiPQPVQPDPVSHCKEKD 

VDDCWFYFTYSVNGNNEVMVHWEMPECPTGPDIIPIVAGWAG 

I VLIGIjALLLI WKLLM I IHDRREFAKFEKEKMNAKWDTGENP I Y 

KSAVTTWNPKYEGK 


6078 


1426 


180 


ETEDVMELl.EEDLTCPICCSLFDDPRVLPCSHNFCKKCLEGir;E 
GSVRNSLWRPVPFKCPTCRKKTFSYWELIPLQVNYSLKGIVEKY 
NKIKISPKMPVCKGH\LGQPLNIP\CL\TDMQI.DL/CGIC\ATR 
GEHTKHVFCS lEDAYAQBRDAFESLFQSFETWRRGDALSRLDTL 
ETSKRKSLQLliTKBSDKVKEFFEKLQHTLDQKKNEILSDFETMK 
LAVMQAYDPBINKLNTILQEQRMAFNIABAFKDVSEPIVPLQQM 
QEFREKIKVIKETPLPPSNLPASPLMKNPDTSQWEDIKLVDVDK 
LSI.PQDTGTFISfa:PWSFYiCLPU,rLLLGLVlVFGPTMFLEWSt. 
FDDLATWKGCLSNFSSYLTKTADFISQSVFYWEQVTDGFFIFNE 
RFKNFTLWLNNVAEPVCKYKLL 


6079 


X£86 


141 


ATARDLGCARRIDRVVMESTPSRGLNRVHLQCRNLQEFLGGbSP"' 

GVLDRLYGHPATCIiAVFREIjPSLAKNWVMRMLFI.EQPLPQAAVA 

LW\TCKEFSKAQEESTGLLSGIJ^IWHTQLLPGGLQGLimPlPRQ 

NLRIAIiLGGGKAWSDDTSQLGPDKHARDVPSLDKYAEERWEWL 

HFMVGSPSAAVSQDLAQI,r,SQAGI»MKSTEPGEPPCITSAGFQFL 

liLDTPAQLWYFMIiQYLQTAQSRGMDLVEILSFLPQLSPSTLGKD 

YSVEGMSDSLI^NFLQHLREPGLVPQRKRKSRRYYPT/RAIAINL 

SSGVSGAGGTYHQPGPI^^\^/ETNyRLYAYTESEIuQIALlAXjFSE 

MLYPPP\NMW\ARVTR\ESVQQArASGITAQQIIHFI.RTRAHP 

VKLKQTPVLPPTITDQIRLWEIiERDRIiRB-rEGVLrNrQFLSQVPF 

ELL\LAHAPKLGVLVFE/JITPAKRLMWTPAGHSDVKRFWKRQK 

HSS 


6080 


1 


1193 


ietidhvgefamaaqaagvsrqraatO^lgsnokalkylgqdfk ■ 

TLRQQCLDSGVLFKDPEFPACPSALGYKDLGPGSPQTQGIIWKR 
PTELCPSPQFIVGGATRTDrCQGGLGDCWLLAAIASLTLNEEXiL 

yrwprdqdfqenyagifhpqplcpps?\fwqygewvewiddr 
lptkngqltiflhseqgnefwsallekayaklngcyeaiiaggstv 

EGFEDFTGGISEFYDLKKPPANLYQIIRKAIiCAGSriUSCSlDVy 

SAAEAEAITSQKIiVKSHAYSVltSVEEVNFQGHPEKLIRLRNPWG 

EVEKSGAWSDDAPEWI^HIDPRRKEELDKKVEDGEPWMSLSDFVR 

QPSRLEICNLSPDSLSSEEVHKWNLVLFNGHWTRGSTAGGCQNY 
PGSS 


6081 


3 


865 


EMLPLIiLPLPLLWA/GAIiAQDARFRliEMPESVTVQEGLCIFVHC 
S VFYI»E YGWKDS TPAYGHW FREGVS VDQETP VATNNSTQKVQKE 
TQGRFHLLGDPSRWNCSLSIRDARRRDWGSYFFWVARGRTKFSY 
KYSPLSVYVTALTHRPDILIPEFI^KSGHPSNLTCSVPWVCEQGT 
PPIFSMMSAAPTSLGPRTLHSSVLTIIPRPQDHGmLICQVTFP 
GAGVTTERTIQLSVSWKSGTVEEWVLAVGWAVKILLUn^CLI 
ILSFHKKKAVRAVEVEENVYAVMG 


6082 


283 


1268 


EARSPQPTQrRTAPGLAAPGLAQPAALRLLLSRPPSAAMDGDGD 
PESVG0PEEASPEEQPEEA5AEEERPEDQQEEEAAAAA\Y\liDE 
LPEPt.IA/LRVIiAAItPRHE\LVQACR\LVCLRWKELVDGAPI,WL 
LKCCX3EGLVPEGGVEEERDHWQQFYPLSKRRRNIJ:,RNPCGEEDL 
EGWCDVEHGGDGWRVEELPGDSGVEFTHDESVKKYPASSFEWCR 
KAQVIDLQAEGYWEELLDTTQPAIWKDWYSGRSDAGCliyEIiTV 
KLLSEHENVIAEFSSGQVAVPQDSDGGGWMEISHTFTDYGPGVR 
FVRFEHGGQDSVYWKGWPGARVTWSSVWVEP ^ 


6083


1865 


309 


KQWCAERRGLGtyiSLADELJLADIiEEAAEEEEGGSYGEEEEEPAlS 



6084 



1865



309 



6086



2419 



6087 



476 



i^vwKfc:TgLDI>SGDSVKTl AKLWi;aKMFAEIMMKIEEYISKQAK A- 
SEVMGPVEAAPEYRVIVDAWNLTVEIENELNIIHKPIRDKYSKR 
FPELESLVPNAIJ3YIRT\^ELGNSW3KCKNNE^I^ILTmTlM 
WSVTASTTQC3QQLSEEELERLEEACDMAI^LNASiCHRIYEYV£ 
SRMSFIAPNIiSIIIGASTAAKIMGVAGGLTNLSKMPACNIMLLG 
AQRKTLSGFSSTSVLPHTGYIYHSDIVQSLPPXPPPPSVAPN^ 
lOiKAARLVAAKCTLAARVDSFHESTEGKVGYELKDEIERKPDICW 

qepppvkqvkplpapldgqrkkrggrryrkmkerlglteirVko 

ANRMSFGEIEEDAYQEDLGFSLGHLGKSGSGRVRQTOVNEATKA 

risktlqrtujkqsvvyggkstirdrssgtassvaptplqglei 

yNPQAAEKKVAEA^^QKYFSSMAEFLKVK GEKSGLMST 

KgwCAERRGlAiMSi^EblADLEEAAEEEEGGSYGEEfeEEPAIE" 
DVOEETOLDT.snn.cjUTr'PT a vr.trrvovMc-ivwr 



*x« " v_^i^,v^iAjw&i^iii,jUADLEEAAEEEEGGSYGEEEEEPAIE' 
DVQEETQLDLSGDSVKTIAKLWDSKMFAEIMMKIEEYISKQAKA 
SEVMGPVEAAPEYRVIVDANNLTVEIENELKI IHKFIRDKYSKR 
frfi^P^^^''^^^^^^*^^NSLDKCKNNEm^ILT^^^^ 
WSVTASTTQGQQLSEEEI.ERLEEACDMAI.ELNASKHRIYEYVE 
I SI^SPIAPNLSIIIGASTAAKlMGVAGGI.'mLSKMPACNIMLLG 
j AQRKTLSGFSSTSVLPHTGyiYHSDIVQSI.PPIPPPFSVAP\X)b 
RRKAARLVAAKCTIJ^VDSFHESTEGKVGYELKDEIERKFDKW 
QEPPPVKQVKPLPAPLDGQRKKRGGRRYRKMKERLGLTEIRXKQ 
ANRMSFGE1EEDAYQED1,GFSLGHLGKSGSGRVRQTX}VKEATKA 
RISKTLQRTLQKQSWYGGKSTIRDRSSGTASSVAFTPLQGI.EI 

- VNPQAAEKKVAEANQKYFSSMAE FLKVKGEKSGLMST 

^4SS I ^ggRSFQGNKAVGRISLGG KRNPEVTLLPGVSSERVRRWRRARV 



w.w^.^,^* v-j.,xvvv*^ttii,i^t^tU«ij'BVTIiIjPGVSSERVRRWRRARV" 
GVARVKPGNPWKPSPATQVPR/VPAQVYLPGRGPPLREGEEIiVM 
DEEAYVLYKRAQTGAPCLSPDIVRDHI^DNRTErLPr,Tr,yLCAGT 
QAESAOSNRLMMLRMHNLHGTKPPPSEGSDEEEEEEDEEDEEER I 
KPOLELAMVPHYGGINRVRVSWLGEEPVAGVWSKKGOVEVPAUl 
RLLQWSEPQAIiAAFXJRDEQAQMKPIFSPAGHMGEGPALDWSPR K 
VTGRIiLTGOC^KNIHLWTPTDGGSWHVDQRPPVGHTRSVEDLOW ' 
SPTENTVFASCSADASIRIWDXRAAPSKACMLTTATAHDGDVW 
ISWSRREPFLLSGGDDGALKIWDLRQFKSGSPVATFKQHVAPVT 
SVEWHPQDSGVFAASGADHQITQWDLG/IVERDPEAGDVBAD^G 
LADLPQQLLFVHQGETEIiKELHWHPQCPGLLVSTALSGFTIFRT 



^^^7 (iAATQHGGAMMLJUPCNPHGM GLLYAGFNQDHGCPACGMEKGFRv 

I TOTDPLKEKEKQEFLEGGVGHVEMLFRCMYLALVGGGKKPKYPP 

WKVMIWDDLKKKTVIEIEFSTEVKAVKLRR\DKIVVVLDSMIKV 

PTFTHNP\HQLHVPE\TCYNPKGLCVI.CPNSNNSLLAFPQTH'rG 

HVQLVDLASTEKPPVDIPAHEGVLSCIADNLQGTRXATASEKGT 

LIRIFDTSSGHLIQEXiRRGSQAANIYCINFJTQDASLlCVSSDHG 

TVHIPAAEDPKRNKQSSLASASFLPKYFSSKWSFSKFQVPSGSP 

CICAFGTEPNAVIAICADGSYYKFLFNPKGECIRDVYAQPLEMT 
DDfClj 



6088 



1684 



W«^0KTGX.P1TIFSRSFPLL TGSDLCEMMPCTCTWRNWRQWIRP 
LVAVIYLVSIWAVPLCVWELQKLEVGIHTKAWFIAGIFLLI^TI 
PlSLWVILQm.VHYTQPELQKPI IRILWMVPIYSLDSWIALKYP 
I GIAIYVDTCRECYEAYVIYNFMGFLTNYLTNRYPNLVLIIUEAKD 
QQKHFPPLCCCPPWAMGEVHiFRCKLGVLQYTWRPFTTIVALI 
CELLG lYDBGNFSFSNAWTYLVI INNMSOLFAMYCriLPYKVLK 
EELSPIQPVGKFLCVKLWFVSFWQAWIALLVKVGVISEKHTW 
EWQTVEAVATGLQDFIICIEMPLAAIAXHHYTPSYKPYVQEAEE 
GSCFDSFLAMWDVSDIRDDlSEQVRHVGRTVRGHPRiCKLFPEjDO 
DQKEHTSLLSSSSQDAISIASSMPPSPMGHYQGFGHTVTPQTTP 
TTAKISDEILSDTIGEKKEPSDK SVDS *^UiAf 

\ ^^^^VRLI^QGHRCLIAPVA PKLVPPTOGVKKGFRAAFRgnKB 
I LERQRbLRCPPPPVRRSEKPNWDYHAEIQAFGHRIiQENFSLDLL 
KTAFVNSCYIKSBEAKRQQLGXEKEAVLUn.K:SWQELSEQGrSF 
SQTCLTQFLEDEYPDMPTEGIKKLVDFLTGEEWCHVAiiNLAVE 
QLTI^BEFPVPPAVLQQTFPAVIGALLQSSGPERTALFlRDFLlI 
TQMTGKELPEMWKIINPMGLLVEELKKRNVSAPESRLTROSG\A 


6089






PTALPLYFVGLyCDKKLIAEGPGEfWAA^^ 
TENRRPWNYSECPKETLHAEKSITAS ^^r^^t 


6090


3 


3054


TKI^IPGSTISSRPRLCMAAKGHFLGHSWTGSRAGAHTGAPXW" 

PSRRLRDLPAGGMWRLRRAAVACEVCX3SLVKHSSGIKGSI.PL0K 

LHLVSRSIYHSHHPTLXI^RPQLRTSFCXJFSSLTHLPLRK^^^ 

PIKyGYQPRRNFWPARIATRLLKLRVI.ILGSAVGGGYTAKKTPD 

QWOMIPDLSEYKWIVPDIVWEIDEYIDFEKIRKALPSSEDLVK 

LAPDPDKIVESLSLLKDFFTSGSPEETAFRATDRGSESDKHFRK 

vsdkekidqlqbellhtqlkyqrilerlekenkelrklvSkdd 

KGIPFIESLRKSr.IDMYSEVLDVIiSDYDASYNTQDHI,PRWWG 
DQSAGKTSVLEMIAQARIFPRGSGEMMTRSPVKVTLSEGPHHVA 
*j i * f vu 1 JVC isiiiiAALtRHE I EL RMRKNVKEGCTVS PETI S 

dpnaiilciqdgsvbaersivtdlvsqmdphgrrtifvltkvdS 

AEKKVASPSRIQQIIEGKLFPMKALGYFAVVTQKGNSSESIEAI 
REYEEEPFQNSICLLKTSMLKAKQVTTRNt.SrAVSDCFWKMV^S 
VEQQADSFKATRFNLETEWKNKYPRLRELDRNEI.FEKAKNEILD 
EVISI^SOVTPKHWEEILQQSLKERVSTHVIENIYLPAAQTMilSG 
TFMTTVDIKLKQWTDKQLPNKAVEVAWETLQEEPSRPMTSPKGK 
EHDDIFDKLKEAVKEESIKRHKWNDFAEDSLRVIQHNALEDRSI 
SDKQQWDAAIYPMEEALQARLKDTENAIEN^fVGPD\WKKRWl.YW 
KNRTQEQCVHNETKNELEKMI^CNEEHPAYLASDEITTVRKNI^ 

V *'v*''-'3«AJNaji.i««yvXKKHFLKTAIjNHCNLC!IRRGF 
FVDSEI^ECNDVVLFWRIQRMIAITANTLRCXJLTm'EVHRI^KNV 
KEVLEDFAEDGEKKIKLLTGKRVQIJ^nKK^ 
ALHQEK 


6091 


194 


1560 


PVtvyiU^tjAVl.EQAS/ASPPIATQTWt>UjHCltlPEI.PVQA3ii;"" 

FEIKJLFFCQLIALFVHYINIYKTVWWYPPSHPPSHTSLNFHLID 

FNLLMVTTIVLGRRFIGSIVKBASQRGKVSLPRSILLFLTRFTV 

LpiTGWSLCRSLlHLBllTySFLNLL/FPLr^VWDVHSVPAAEIJl 

P\RKTSLFNHMASMGPHEAVSGLAKSRDYLLTLR\RRG5STQDS 

CMARTPCP/PHACCt^PSLIRSEVEFlJCMDFWWRMKEVLVSS^^ 

SAYYVAm'WFVKNTHYYDKRWSCELFLLVSISTSViriMQHLL 

PASYCDLLHKAAAHLGCWQKVDPALCSNVIiQHPWTEECMWPQGV 

LVKKSKNVYKAVGH YNVAI PSDVSHPRFHFPFSKPLRtLNII,LL 

VLGKAYSYSASPQRDLDHRFS 


6092 


3279 


412 



SSRTkiLMbKKKlLRROIRLLQGLIDDYKTUIGNAPAPgTPAASG 
MQPPTYHSGRAFSARYPRPSRRGySSHHGPSWRKKYSLVNRPPG 
PSDPPADHAVRPLHGARGGQPPVPQQHVLEROVQLSQGQNWIK 
VKPPSKSGSASASGAQRGSr^EFEDTPWSDQRPREGEGEPPRGQ 
LQPSRPTRARGTCSVEDPIiLVCQKEPGKPRMVKSVGSVGDSPRE 
PRRTVSESVIAVKASFPSSALPPRTGVAL6RKLGSHSVASCAPO 
LLGDRRVDAGHTDQPVPSGSVGGPARPASGPRQAREASLWTCR 
TNKFRKMNYKWVAASSKSPRVARRALSPRVAAENVCJfASArMaM 
KVEKPQLIADPEPKPRKPATSSKPGSAPSKYKHKASSPSASSSS 
SFRWQSEAGSKDHASQDSPVLSRSPSGD\RPAVGHSGI*KPLSGE 
<ruc>«iivvrvjst<.xjMiKKKG5TSLPGDKKSGTSPAATAKSHLSLR 
RRQAIilGKSSPVI,KKTPNKGLVQVTTHRLCRLPPSRAHLPrKEA 
SSLHAVRTAPTSKVIKTRYRIVKKTPASPLSAPPFPLSLPSWRA 
^RLSLSRSLVLNRLRPVASGGGJCAQPGSPWWRSKGYRCIGGVLY 
CVSANKLSKTSGQPSDAGSRPLLRTGRLDPAGSCSRSIASRAVQ 
?SLAIIRQARQRREKRKEYCMYYNRFQRCMRGERCPYIHDt>EKV 
WCrrRPVRGTCKKTDGTGPFSHHVSKEKMPVCSYFXKGlCSNSN 

:pyshvyvsrkaevcsdflkgycplgakckkkhtij:.cpdfarrg 

^CPRGAQCQLLHRTQiamSRRAATSPAPGPSDATARSRVSASHG 
'RKPSASORPTRQTPSSAALTAAAVAAPPHCPGGSASPSSSKAS 
'SSSSSSSPPASLDHEAPSLQEAALAAACSNRLCKLPSPISLQS 

pspgaqprvraprapltkdsgkplhikprl 




143 


3190


JCAPPTGESSEPEAKVZoTTKRLYRAVVEAVHRX^DIilLCMKTAYC?* 
VFXPBNISLRWKlJiEI.CVia>MFr.HPVDYGRiCABBLLKRKYyyE 
SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide (A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop Codon, /=possible nucleotide deletion, \=possible nucleotide insertion)








VlQLIKTNKiCHiHSRSTriECAVT^THLVAGIGFyQHLLLYiaSHY 
QLELQCCrDWTHVTDPLIGCKKPVSASGKEMDWAQMACHRCLVY 
LGDIiSRYQNEIAGVDTELLAERPYYQALSVAPQIGMPFNQLGTl* 
AGSKYYNVEAMYCYLUCIQSEVSFEGAYGNLKRIiYDKAAKMYHQ 
LFaCCETRKLSPGKKRCKDIKRLLVNPMYLQSLUJPKSSSVDSEL 
TSltCQSVLE DFNLCLF YIjPS S PNLS IiAS EDEEE YESGYAFLPDL 
t*XFQMVIICl«MCVHSLERAGSKQYSAAIAFTriAr,PSHliVNHVNI 
RLQAELEEGENPVPAFQSDGTDEPESKEPVEKEEEPDPEPPPVT 
PQVGEGRKSRKPSRLSCLRRRRHPPKVGDDSDIiSEGFESDSSHD 
SARASEGSDSGSDKSLEGGGTAPDAETDSEMNSQESRSDIiEDME 
EEEGTRSPTT.EPPRGRSEAPDSIINGPU5PSEASIAS1JX.QAMSTQ 
MPQTKRCFRIAPTFSIsnijiaiQPTTNPHTSASHRPCVNGDVDKPSE 
PASEEGSESEGSESSGRSCRNERS I QBKLQVIiMABGliLPAVKVF 
ItDWLRTN PDLl XVCAQSSQS LMNRIiS Vu 

LCPEVQDLLEGCELPDLPSSIililiPEDMALRNLPPLRAAHRRFNF 
DTDRPIiLSTliEESWRXCCIRSFGHFIARLQGSILQFNPEVGIP 
VS I AQS EQESIiliQQAQAQFRMAQEEARRNRI^MRDMAQIiRXKJXiEV 
SQI.EGSLQQPKAQSAMSPYIiVPDTQAIiCHHI.PVIRQI»ATSGRFI 
VI I PRTVIDGI^DLLKKEHPGARDGIRYLEAEFKKGNRyiRCQKE 
VGKSFERHKLKRQDADAVrriiYKirjDSCKQLTKlAQGAGEBDPSG 
MVTIITGI*PIiDNPSLLSGPMQAAIiQAAAHASVDIKN\nbDFYKQW 




76 


1002 


ACGRRAWIALRVART/SRWGAI^XRGAVWAPGTRPSKRRACWAI/L 
PPVPCCLGCLAERWRLRPAl^IiRIiPGIGQRNHCSGAGKAAfeX 
PAAGAGAAA]5yiiPGGQWGPASTPSLYENPWTIPNMIiSMTRIGI*?UE> 
VLGYLI lEEDFMlALGVFAIAGIiTDIiLDGFIARNWANQRSALGS 
AIJiPLADKILIS IIjYVSbTYADIilPVPLTYMI ISRDVMlil AAVP 
YVRYRTLPTPRTLAKYFNPCYATARLKPTFISKVNTAVQLXLVA 
ASIiAAPVFNYADSiyiiQILWCFTAFTTAASAYSYYHYGRKTVQV 
IKD 


6094


23 


1010 


PFI^RCLRGDQlOiKMSERKVLNKYYPPDFDPSKIPKLKIiPKPRQY 
VVRLMAPFNMRCKTCGEYIYKGKKPNARKETVQNEVYliGLPIFR 
FYI KCTRCIaAE ITFKTDPENTDYTMEHGATRNFQAEKl.l»EEEEK 
RVQKEREDEEIxNNPMKVLENRTKDSKLEMEVLENLQErtKDIiNQR 
QAHVDFEAMLRQHRIjSEEERRRQQQEEDEQETAAIiLEEARKRRL 
XiEDSDS EDEAAPS PLQ PAbRPWPTAlIiDEAPKPKRKVEVWEQSV 
GSLGSRPPIiSRLVWKKAKADPDCSNGQPQA/APHPRSPAEQEG 
GQPYTPDAWRVIiPEPTGCI PGQ 


6095


1 


1553 


TRGRAAERSRGRGHGFIiGGGFAXSWDYFPSEDFYRCGYCKNES 
GSRSNGMWAHSMTVQDYQDLIDRGMRRSGKYVYKPVMNQTCCPQ 
YTIRCRPLQFQPSKSHKKVLKKMr,KFI»AKGEVPKGSCE\DEPMD 
STMDDAVAGDFALINKIiDIQCDLKTLSDDIKESIiESEGKNSKKE 
EPQELLQSQDFVGEKLGSGEPSHS 






WO 01/53312

PCT/US00/34263






2277 



675 



6097



1673 



192 



6098 



168 



6099 



168



6100



1074 



1074 



713

WRSSPPSSQPKATI^ESVQVYKRyQMVIHKNPPDTPTE^^ 
MGFYrHSCPKMKYKCK2YRPSDI,LCPETYVWV?IEQ^ 

ycrfnqdpeavdedrstepdrlqvfhki^impygwkk^Sps 

.EBA AVLQYASLVGQKCSERMLL FRN ^KKQQKDPS 
QRVRAAi.L^i:;AMEDSEALGF EHMGLDPRLLQAVTDlk3WSRPTLT 

qekaipuu.egkdllarartgsgktaavaipmlqS^^ 

WEQAVRGLVLVPTKBIARQAQSMIQQIA^^^S^ 
LSLIRGKSLLFVNTLERSYRLRLPLE0PSTPTrVT.wn_OT;vi±^ 



u«±i:sUtWQGFYDa^IATDAEVLGAPVKGKRRGRGPKrnirren; 

I^FVLPTEOFHIXSKIEELLSGEmGPILI^VQFRM^GPRYR 
CRDAMRSVTKQAXREARI,KElKEELLHSEKLi^^^^^of^.^S 




AKiTMSGGKKKSSFQITSV TTUyEGPGSPGASDPPTPOPPTGPP " 
RELftBRNAALEQENGLI,RAIA\SPEQLGSAaPPRGVl.D\T^S> 



Kr^^?^^^''^''^'^^^^^^^^«^T^^AYSPKRSPKE 
NLSPGFSHLr^KKESSPIRFDrLLDDLDTVPVSTLQRTNPRKOI* 
\QFLPLDDSEEK\TYSEKAT\DMIVNHSSCPEPVPNGVKKVSVR 
TAWEKNKSVSYEQCKPVSVTPQGNDFBYTAKtRTIAET^F^ 

^^VT^^i^^^^^^^^^^^^^ ^^^^^^^'^'^t^lPIMRALKEJbD 
pf^on^r^^^^°^^"^^^^Q^SVNASRSPEKCAQQRQK 
RUTSASQRSSSLPPSNRKSSTPTKREIMLTPVTVAYSK^PKE 
in.SPGFSHI.LSKWESSPIRFDILLDDI^TVPVSTLQRTNPRKOL 
\QFI.PI.DDSEEK\TySEKAT\DNIVNHSSCPEPVPNGV^^ 

TAWEKNKSVSYEOCKPVSVTPQGMDF^YTAKIR^S\^ 

E LTlgKDQlEAALSRHPSPGGRITL OTRmQF;;.^ 

FVE V3Q YKSRADPEPRGRDTMl .AiLb 1 1 IGDTGVGKSCUaL 

pIT^^^^^^^'^^^'^'^^'^w^^^kqiklqiwdtagqesf 

,™7*?^^^^^^^^'^*^*^S'^^'^TSWLEnARQHSSSNM 

yio^i^'^^^^^^^^^^^^^^^^NH^^^IFMETSAKTACN 

VEEAFINTAKEIYRKIOQGLFDVHNEANGIKIGPQQSISTSVGP 
SASQRNSRDIGSNSGCC 

i-KURAWPLREVSHWI.GCRRy C^WSASWGKX.PALSARLSPI,LAFR ■ 
GKMVFPLSCAVQQYAWGKMGSNSEVAHLLASSDPLAQIAEDKPY 

AELWMGTHPRGDAKXI.DWRI SQKTLSQWIAENQDSLG5KVIcnTT. 



6102 






kpemaialtpfqglcgftjpveeivtflkicvpbfSfligdS^ 

LKQTMSHDSQAVASSLQSCFSHZ^KSBKKVVVEOI^LV^SQ 

QAAAGNNMEDIFGBLLI^UICXJYPGDIGCPAIYMLLTLKPGE 

AMFLEANVPHAYLKGDCVECMACSDtmOUVGLTPKFID^^ 

MLSYTPSSSKDRLFLPTRSQEDPYt^SIYDPPVPDFTIMKAXEVP 

G\SVTEYKDIALDSASILrj4VQGT\aASTPTTOTPiPiriR^ 

FIGANESVSLKLTEPKDLLIFRACCLL ^"^^^^^^^^^ 


6103


70 


2415 


QTPQATIAAMGAKUSRGGEMLPAGSIGASPAAPCCSESGDERI^-" 
LEEKSDim^TVLIGSKQVSEGTDKGDLPSYVSAPTFvpvr-K^T^ 

SLKKU3KLIEQRTVSKMQLEEQVLTISSBIPKRIRSM^KNAEES 

HIAYL!OTISQrEBLSDNI<X!YLMTN>IVPBAASTLVSMABLDlKl! 
OESSCTHLLGFMRATVKFWHKIIiKDKLTSDrEEILAQLHWPPIA 

LMMl,VLEKlATDIPCLLyDDtn,FCHtVt)EVLLFERBLHSVHGYP 

INLLBiUJCHDMl.TRQVDHVFRBVKDAAKLYKKERWLSLPSQSEO 
KHIKEACIVLHU)VOSALrAGKDVI,PVOLOeSPBftT , " 


6104 


207 
124 


2523


m.ftpi,kerpdlppj:qyepvlcsr7tcravlnplcqvdyr^h 

^f^°'^°'"^^^''''°^^="'QPAEI.LPQFSSrEywi,ROPQM 
PI.iri,YWDTCMEDEDLQALKESMQMSMLLPPTAI,VOLIrEtaj 
MVQVHELGCEGISKSyVPRGTKDr.SAKQLQEMI«LSKVPVTQAT 

^^f^°^S^^^'^^ECTFPOTGARIMMFIG5pATQ0PoS^ 
DELK. PIRSWHDIDKDNAKYVKKGTKHPEALANRAATTGHVIDI 
YACflIJ3QTGI,LEMKCCPNLT(3GYMVMGDSFNTSI,FI«3TFQRVFT 
KDMHGQPKMGPGGTLElKrPRXEIKlSGAIOPCVSuisKGPCVS 
f^'?n^^3f°'"'^'^"^"^^'^^Q««^"QGG\RG 

"'^^^^'"^^^^®P'°^'**'I'DRQI'IRLCQKFGEyHK 
DrasSPRFSETFSLYPQFMFHLRRSSPLQVFNNSPDESSVYRHH 
FMRQDI,TQSLIMIQPILyAYSFSOPPEPVU:,DSSSirADRir,LM 

^^??^I^^'^^^''''^"'^^°*^'^'™»PSQTHNNMYAWGQES 
jAPILTDDVSLQVFMDHLKKIAVSSAA 


6105 




732 

; 


Jr^^Q»^^^™I^^''^"*'^''''^'^^'''''SA^GVti*f™ERLLR 
Dl?™M^rn^^°^'^''°^*°^°«SQQMIPVEVKRI/RSL 
'™?^'™^^^^^=^«5KLIKVKNNIDVCPECGH 




3 


989


>LHGAU6LyUiRtCHRl<PRPCAPARPEDMi«?PAA^^ 
rSQRAKAATACGRPRMUORIWGGQDTQEGEWPWQVSrQRNGSHF 

SVIFETGMKaWXSMGSPSEEDLLPEPRILQKLAVPIIDnPR 
LVGQSWLQAGVISWGEGCARQNRPGVYIRVTAHHNWIHRIIPK 



LQVQPSEVGRPEVTPPGPGAP ^- 


6106 3


1302 


GRPPTAPHTGRPPTANRGDPRLDLKRGCARLLTSIESRGRPAAS 
AGLRRDRCALRRMPLRRAPLARATRRRAGSPRRCAPRPRACPQG 
! WSRARHQPGGLCLLLLLLCQFMEDRSAQAGNCWLRQAKNGRCQV 
LYKTELSKEECCSTGRLSTSWTEEDVNDNTLPKMMIFNGGAPNC 
IPCKETCBN\'DCGPGKKCRMNKKNKPRCVCAPDCSNITWKGPVC 
GLDGKTYRNECALLKARCKEQPELEVQYQGRCKKTCRPVFCPGS 
STCVWDQTNNAYCVTCNRICPEPASSEQYLCGNDGVTYSVSAC 
HLRKATCLLGRS IGrAYEGKCIKAKSCEDIQCTGGKKCLWDPKV 
GRGRCS LCDELCPDSKS DEPVCASDNATYASECAMKBAACSSGV 
LLEVKHSGSCNSISEDTEEEEEDEDQDYSPPlSSXIiEW 


6107 623 


i^a 


i^RCSSPRPEPGRGRGK/LSPSEHRKWVEVFKACDEDHKGYI^SRE 
DFKTAWMLFGYXPSKIEVDSVMSSINPNTSGILIiBGFLNIVRK 
KKEAQRYRNEVRHIFTAFDTyYRGFI,TIiEDFKKAFRQVAPKLPE 
RTVLEVFRE V\ DRDS\ DGHVSF 


6108 3 


1348 


GGSLRFSPPRVPS.CSRVFCPVPPGGCGLPSPMSASRPQSPTTPW 

CLPRRYMKHKRDDGPEKQEDEAVDVTPVMTCVPWMCCSMLVLL 

YYFYDLLVywiGrFCLASATGLYSCIAPCVRRLP\SASAGESA \ 

liAPTIPimSLPYPHKRPQARMLLLALFCVAVSVVWGVFRNEDQ 

MAWVLQDALGIAFCLYMLKTIRI,PTPKACrri.LLLVi:,FLYDIFPV 

FITPFUTKSGSS I MVEVATGPSDSATREKLPMVUCVPRLNSSPL 

ALCDRPFSLLGFGDiriVPGLLVAYCHRFDIOVQSSRVYFVACTI 

AYGVGIiLVTFVALATiMQRGQPAIiIiYLVPCTLVTSCAVAIiWRRBL 

GVFWTGSGFAKVIiPPSPWAPAPADGPQPPKDSATPLSPQPPSEE 

PATS PWPAEQS PKSRTSEEMGAGAPMREPGSPAES EGRDQAQPS 

PVTQPGASA 


6109 1 


1381 


L-Kb'KAGAASGGAILEGTKLRRQRVDrNKPLDPLVPSALRAAMlir^ 

LEDYLEMIEQLPMDLRDRFTEMREMDLQVCWMDQLEQRVSEFF 

MNAKKNKPEWREEQMASIKKDYYKALEDADEKVQIiANQlYDLVD 

RHLRKIiDQELAKFKMEr.EAI>NAGITEILERRSLEI,DTPSQPVWN 

HHAHSHTP VEKRKYNPTSHHTTrDHI PEKKFICSEALbSTLTSDA 

SKENTLGCRNKNSTASSNNAYNVNSSQPLGSYMIGSLSSGTGAG 

GI \TMAAAQAVQATAQMKEGRRTSSLKASYEAPKNNDFQLGKEF 

SMARETVGYSSSSAIJ^TTLTQNASSSAADSRSGRKSKNNNKSSS 

QQSSSSSSSSSLSSGSSSSTWQEISQQTTWPESDSNSQVDWT 

YDPNHPRYCXCNQVSYGEMVGCDTQDCPIEWFHYOCVGLTEAPK 

gkwtycpqctXaamkrrgsrhk 


6110 77 


2464 


ACPSAATMSDQDHSMDEMTAWKIEKGVGGNNGGNGNGSGAFSQ 
ARSSSTGSSSSTGGGGQESQPSPIiAItl.AATCSR lES PKENSNNS 
QGPSQSGGTGELDLTATQLSQGANGWQIXSSSSGATPTSKEQSG 
SSTNGSWGSESSKNRTVSGGQYWAAAPNLQNQQVLTGLPGVMP 
NI QYQVl PQFQTVDGQQLOPAArGAQVQQDGSGQiQII PGANQQ 
I ITNRGSGGNI lAAMPNIXQQAVPLQOLANNVLSGQTQyVTNVP 
VAIiNGNITLliPVNSVSAATLTPSSQAVTISSSQSQESGSQPVTS 
GTTISSASLVSSQASSSSFFTNANSYSTTTTTSNMGIMNFTTSG 
SSGTNSQGQTPQRVSGIiQGSDALNlQQNQTSGGSLOAGQQKEGE 
Q\NCX3TQAAPKST.SRPQrjVQGG\QALQ\AFQAAPLSG0TFTTQA 
ISQETLQMLQLQAVPWSGPIIIRTPTVGPWGQVSWQT3LQLQNU> 
VQNPQAQTITLAPMQGVSIiGQTSSSNTTLTPIASAASIPAGTVT 
VNAAQLSSMPGLQTINLSAU5T5QIQVHPI0GLPIAIAHAPGDH 
GAQLGLHGAGGDGIHDDTAGGEEGENSPDAQPQAGRRTRREACT 
CPYCKDSEGRGSGDPGKKKQHICHIQGCGKVYGKTSHLRAHLRW 
HTGERPFMCTWSYCGKRFTRSDELORHKRTHTGEKKFACPBCPK 
RFMRSDHLSKHIKTHQNKKGGPGVALSVGTLPLDSGAGSEGSGT 

ATPSALITTNMVAMEArCPEGIARLANSGINVKEGGQPCSPIKT 
SANGF 








csqyglqnvdgemleevfhnldpdgtmsvedffyglfkngkslt"^ 

ddgmghasverildtwqeegiensqeilkatxifgldgninltel 
tlalenellvtkns ihqaci 


6745 


1 


588 


TFRDQGWAQRRRWLLGCASWES WEAAIAAGPGLPSSTAR{2QNNP ' ~ 
AAGTECFAAVWARGTAMGSVLS TDSGKS APASATARALERRRDP 

PT.PVTC J?tiC^\7r*T ^'Q\T1 .Vtr\T>\fX>'VX> r'f^tnrtfr'Tycu^'rri'vcir VKTKtvjjrnrt 
CiutrvX£>eu\,^v\^ur.Vijxi^tfvKl K^uJtlVx URSCIAXoitKNNKWTC 

P YCRAYLPS EGVPATDVAKRMKSEYKNCAECDTI.VCLSEMRAHI 

RTCQKYIDKYGPLQELEETA 


6746 


110 


492 


GATGAMAESAPARHRRKRRSTPLTSSTLPSQATEKS S YFQTTE I 
bLiW I V VAAXQAVbKKMESQAARLQSLtEGRTGTAEKKI^ 
VEFGNQLEGKWAVLGTLLQEYGIiLQRRIiBNVENIiI.RNRN 


6747 


247 


48


EAVTFIOJVAWPTEEELGULDIiAQRKIiYRDVMLENFRNLI^VGH 
QPFHRDTFHFLREEKFWMMDIATQREGNSVYAGVC 


6748 


201 


665 


MTTFKEAVTFKDVAWFTEEELGLr^DPAQRKLYRDVMLENFRNL 
LSVGNQPFHQDTFHFLGKEKFWKMKTTSQREGNSGGKIQIEMET 
VPEAGPHEEWSCQQIWEQIASDLTRSQNSIRKSSQFFKEQDVPC 
QI EARLS ISXVQQXPYHCNECKQ 


6749 


95 


719 


RREVKGGDGVCPRARGSPQSQQFPSCAGGGEGLQQSGEAIiDGAH 
SAGGPCPAAAGGGPGGASCSVGAPGGVSMFRWLEVLEKEFDKAF 
VDVPLIiLGEIDPDQADITYEGRQKMTSLSSCFAQLCHKAQSVSQ 
INHKl4BAQI*VDLKSKI.TBa*QAEKWLEKEVHr)0LLQLHSIQriQI. 
HAKTGQSADSGTl KAKLSGPSVEELERELKAN 


6750 


3 


428 


SCESRRPGAKtWWASGALPRDTTGIiGSEQPSGDVAQSNRATMGT 
TAPGPIHLIiELCDQKLMEPLCNMDNKDLVHIiEEIQEEAERMFTR 
EFSKBPELMPKTPSQKNRRKKRRISYVQDKNRDPIRRRIiSRRKS ' 
RSSQL5SRR 


6751


IS 2 


1417 


PTKATEMAGASVKVAVRVRPFNSREMSRDSKCIIQMSGSTTTIV 
NPKQPKETPKSFSFDYSYWSHTSPEDINYASQKQVYRDIGEEML 
QHAPEGVNVCIFAYGQTGAGKSYTMMGKQEKDQQGIIPQLCEDIj 
FSRINDTTNDNMSYSVEVSYMEIYCERVRDLLNPKNKGNLRVRE 
HPLI/5PYVEDLSK1^VTSYND1QDLMDSGNKARTVAATNMNETS 
SRSHAVFNIIFTQKRHDAETNITTEFCVSKISrjVrHAGSERADST 
GAKGTRliKEQANimcSXjTTLGICVISAIiAEt^DSGPNKNKKKICKTD 
FX P YRDS VliTWLLREl^GGNSRTAMVAALSPAD I NYDETLS Tt«R 
YADRAKQIRCNAVINEDPWNKLIRELKDEVTRLRDIJLyAQGLGD 
ITDMTWAIiVGMSPSSSLSAliSSRJNV 


6752 


24 


1834 


RNCVPPLGCYRSRVKFHSD IKMQYSHHCEHUbERLNKQREAGFL 
CDCTIVIGBFQFKAHRNVLASPSEYFGAXYRSTSENNVFLDQSQ 
VKADGPQKIiLEPIYTGTLNIiDSWNVKEIHQAADYLiKVEEVVTKC 
KI KMEDFAFIANPSSTE ISS ITGNT'SIiNQQTCXiIiTLRDYNKREK 
SEVSTDLIQANPKQGAIJUCK5SQTKKKKKRFNSPKTGQNKTVQY 
PSDILElSASVELFLDANKIiPTPWEOVAQINDNSELELTSWEN 
TFPAQD I VHTVTVKRKRGKSQPKCALBaSHSMSNlASVKS P YEAE 
NSGEELDQRYSKAKPMCNTCGKVFSEASSLRRHMRIHKGVKPYV 
CHLCGKAFTQCNQLKTHVRTHTGEKPYKCBLCDKGFAQKCQLVF 
HSRMHHGEEKPYKCDVGNLQFATSSNLKIHARKHSGEKPYVCDR 
CGQRFAQASTLTYHVRRHTGEKPYVCDTCGKAPAVSSSLITHSR 
KHTGEKPFICELCGNSYTDIKmjKKHICrKVHSGADICrLDSSAED 
HTLSEQDSIQKSPLSETMDVKPSDMTI.PIALPIiGTEDHHMLLPV 
TDTQSPTSDTLLRSTVNGYSEPQLIFX^LY 


6753 


2 


1305 


VPSIiPYPPQKWAHTEFTTSSDSETANGIAKPDPVMPGGEEKAS 
PFGIKLRRTNYSIiRFMCDQQAEQKKKKRHSSTGDS ADAGP PAAG 
SARGEKEMEGVAIiKHGPSLPQERKQAPSTRRDSAEPSSSRSVPV 
AHPGFPPASSQTPAPEHDKAANKMPIAQKPALAPKPTSQTPPAS 



6754 



6755 



298 






413 



1343 



180 



6757



6758 



6759 



754 



4SS 



1008 



SX3 




GPEERKGQKRDEEEEATERKPASPPLPATQQEbsQTPEAGR^ 

EERKQAREAKQAEKLSKENVSVSVQPGSSSVSRAGSLHKST^ 

EEKRPETAVSRLERREQLKKANTI.PTSVTVEISYSSPAAPLVKE 

VSKRFSSPDDAPVSSEPAWIALAKRKAKAWSDCPLIIK 

^•VK>U<RRRLGGk>KVNTMS5IJIKS RiAlJ^'UDVLKEPSlALEKmE - 

^f^fM^^^^'^^^^'''''^"*^^^^^^P^^WTSlLAKQRELYAQ 

PLREMIIQPOrAKANWGVSREDVTFEDHPLNPNPDSRWNTYFKD 
NEVIjXi 

i'GLQLQVALfcADWFJCDMPGGRRGPSRCXarii.'RaAXiPSrOTLVGGG 
CGNGTGLRNRNGSAlGLPVPPTT&T.T'rrir:nTmrr^*ri;, , 




Zr.IT ^^^^^^-tQYINiyKTVWWYPyWHPASCTSLNFHL 

IDYHIAAFITVMrJU^RLVWALlSEATKAGAASMIHYMV^ 

VLLTLCGWVLCWTLVNLFRSHSVLNLLFLGYPFGVYVPLCCFHO 

KEQFNWATPIPTHSCPLSPDLIRNEVECLKADFNHRIKEVLFMS 
LFSAYYVAFLPLCFVKVSGYLTFMC FLDIO/NYIWWVPT.V 

LRSLQPQPPGLKQSFCLRVrXSLQTGATTPQLRDLTCKELIII^TE 
REAQKRKKRKEKESGMALTQGPLTFRDVAIEPSQEEWKSLDPVO 

waRVEAPEflHSRESQGSD AMRKHLSWWWLATVCMLLFSHLSAVO 

TRGlKHRIKWlSIRICAT.PflT&nTTWTi/^t?ftr.XTi^r»r^»T,-P,,,. 




AAWQGSFQKPDNXLHQQVI.W 

AbGPKZ,PGRi^'RDRAPWLPARLLRGVLAVWVaLSAl/3PGSFCRR 
RVPSLAQtiGHSEAAPSPDDVRWSRVPDRCPEERDRAMPPPPPPS 
LPPSFRRNMANNSPALTGNSQPQHQAAAAAAQQQQQCGGGGATK 
PAVSGKQGNVtiPLWGNEKTMNLNPMILTNILSSPYFKVQLYELK 
TYKEWDEIYPKVTHVEPWEKGSRKTAGQTGMCGGVRGVGTCGI 
VSTAFCLLYKLPTLKLTRKQVMGLITHTDSPYIRALGPMYIRYT 
QPPTDLWDMFESFIJDDEEDLDVTCAGGGCVMTIGEMLRSFLTKLE 
WFSTLFPRIPVPVQKNIDQQIKTRP RKI 

KKHNgHSLDQxaTRAFHPQ TGLPLLSSPVP^jKK'IXiSGCFDLDSS " 
LLHLKSFSSRSPRPCLNIEDDPDIHEKPFLSSSAPPITSLSLLG 
WFEESVLNYRFDFr.GrVDGFTAJErVGASGAFCPTHLTLPVEVSFY 
_SVg DDNAPSPYMGVITLESLGKRGYRVPPSG TIQWCVL 

VLSKKKCITi.QJiPTr.tfPrPPMMC'Tr'gtTamtrT-tTrT^ ^ ' ^i l V ' .jH^^ ' . !~ 



6761 



29 



1733 



VIiSKKKGLSAEEKRTRMMEIFSET KDVFQLKDI^KTAPKSKGIT 
AMSVKEVLQSLVDDGMV3X:ERIGTSNrYyMAPPSKAI,HARKHKr.E 

vlesqlsegsqkhaslqksiekaki grceteert 

liKTL,RGLREvAAPSDVADA AVSRRGRCCCCLHt:iX3TQVAQDCPS 

ssssvqrcelslfqslhtmtskklvnsvagcaddalaglvacnp 
nlqllqghrvalrsdldslkgrvallsgggsghepahagfigkg 

MLTGVIAGAVFTSPAVGSIIUAAIRAVAQAGTVGTLLIVKMYTGD 
RI^FGtJU^EQARAEGIPVE^m^lGDDSAFTVLKKAGRRGLC:X3TV 
LIHKVAGALAEAGVGLEEIAKQVNWTKAKGTLGVSLSSCSVPG 
SKPTFELSADEVELGLGIHGEAGVRRIKMATADEIVKLMT^HMT 
NTTNASHVPVQPGSSVVMMVNNLGGl^FLELGIIADATVRSLEG 
RGVKIARALVGTFMSALEMPGlSLTl.I*r.VDEPLtICLIDiAETTAA 
AWPNVAAVSlTGRKRSRVAPAEPOEAPDSTAAGGSASKRMALVIi 
ERVCSTLLGLEEHLNALDRAAGDGDCGTTHSRAARAIQEMLKEG 
PPPASPAQLLSKLSVX^LLEKMGGSSGALYGLFLTAAAQPLKAKT 
I SLPAWSAAMDAGLEAMQKYGKAAPGDRTMLDSLWAAGQBL 


6762 


3 


613 


ASTISWRLCVAGAEARRPVPVAGERAGGGAMWFMYLLSWIjSIiFI 
QVAPITTiAVAAGLYYI^LIEEYTVATSRIIICYMTWircrTairT rn 

LYVFERFPTSMIGVGLFTNIiVYFGLLQTFPFIMLTSPWFrLSCXS 
LVWNHYLAFQFFAEEYYPPSEVLAYFTFCLWriPFAFFVSLSA 
GBNVLPSTMQPGDDWSNYFTKGKRGK 


6763 


2 


760 


SGPDFPGRRFRGCCCVRPPAGAGMEU3GHWDMNSAPRLVSEXAE 
RKQEQKTGTEAEAADSGAVGARRPLLCLYLGGFLDLFGVSMWP 
LLSLH\nCSLGASPTVAGIVGSSYGXLOIiFSSTLVGCWSDWGRR 
SSLLACILIiSAliGYLriLGAATNVFIjFVLARVPAGIFKHTIiSISH 
ALLSDWPEKERPLVIGHFNTASGVGFILGPWGGYJ.TELEDGF 
YLTAFICPLVFItiWAGIiVWFFPRREAKPGSTE 


6764 


80 


438 


jji\.xvi'ii-f vK_i*ijj? &wiiVKt<vcrJ.ijii{j,ujNiiVUt iQUAKDFEDFR 
KKWQRTDHE]W3KX^a)LLMKAETERSAlX^VKLKHARNQV^:)VElKR 
RQRAEADCEKLERQIQLIREMLMCDTSQSIQ 


6765 


3 


550 


ARYSRVDHFCRRRCRAVARAPRFbLQPPSGPSRHFLAACVARWiT" 
RGSVIiVSEALSGSAKDGI VrEVAVGVKRGSDEI,LSGSVr.SS PWS 
ini jv3oi J V V x/u'«urivL/>^ j\.j\.r NtJiiUtU^ULjAFoRVIjHIRKLtPGEVTETE I 

VIALGLPFGKVTNILMLKGKNQAFI^ELATEEAAIT^GNYYSAVT • 
PHJLRNQ j 


6766 


1 


1287 


AVI>SliCCX)TSRSOPPVKAFXiIjISTLJfnirpf;TPVT?T T>T^ir-crwc^ * 

KFVDEOraVTVRLKEPPVDICLSKANSSSLKGFLSAMRIiAHRGCN j 
VDTPVSTLTPVKTSEFENFKTKMVITSKKDYPLSIOTPySLEHL I 
QTSYCGLVRVDMRMLCLKSLRKLDLSHNHlKKItPATIGDLlHLQ 
ELNLNDNHLBSFSVALCHSTLQKSLWSLDLSKNKIKALPVQPCQ 
I^ELKNIiKLDDNELIQFPCKIGQLIKLiRFLSAARNKLPFLPSEP? 
RNLSLEYLDIiFGNTFEQPKVLPVIKLQAPLTLLESSARTILHNR ^ 
IPYGSHIIPFHLCQDLDTAKICVOSRFCLNSFIQGTTTMNIiHSV 
AHTWLVDNJW3GTEAP I IS YFCSLGCYVNSSDI 


6767 


336 


919 


APMICLCSSDIiQFRYKEAFLRDRGLQIGYCSVDDDPRMKHFliiV 
GRLQSDNEYKKDFAKSRSQFHSSTDQPGLLQAKRSQQLASnVHY 
RQPLPQPTCDPEQLGLRHAQKAHQLQSDVKYKSDLHLTRGVGWT 
PPGSYKVEMARRAAElANARGliGLQGAYRGAEAVEAGDHQSGEV 
NPDATEILHVKKKKALLr, 


6768


2 


363 


PGSTISCifLtiSEGSIiPLCMQVACGEEKHRAPTMKTLRARFKKrE 
LRLSPTDLGSCPPCGPCPIPKPAARGRRQSQDWGKSDERLLQAV 
ENNDAPRVAAIiIARKGLVPTKLDPEGKSAFHL 


6769


284 


396 


MSTPDFSTAEKWQEiANEVSCIiKAMLTLMLaAI^QAD 


6770 


1 


397 


QRM YQ VI WS STMAKLHD Y YKDE WKKLMTE FNYNS VMQ VPRVEK 
ITIiNMGVGEAIADKKlXDNAAADUUVISGQKPLITKARKSVAGF 
KIRQGYPIGCKVTLRGERMWEFFERLITIAVPRIRDPRGLSAKS 


6771


3 


3 7a 


APAGTLAMTGKSVKDVDRYQAVLANLLLEEDNKFCADCQSKGPH 
WASWNIGVFICI RCAGIHRNLGVHISRVKSVNLDQWTQEQIQCM 
QBMGNGKANRLYEAYLPETFRRPQIDPYI^FWSNLEG 


6772 


1 


1400 


aaaflqgmtvngfintvitsl\brrydlhsyqsgiiiassydiaa ■ 
clcltfvsyfggsg\hkprwlgwgr\vlmgtgslvfalphftag 

P**GWKLDAGVRTCPANPR\PVCAG\HTSGLSRYQI.VFMLGaFL 
HGVGATPIjYTLGVTYLDENVKSSCSPIYIAIFYTAAILGPAAGY 
LIGGALLNIYTEMGRRTELTTESPXiWVGAMWVGFLGSGAAAFFT 
AVPILGypRQIjPGSQRYAVMRAAEMHQt.KDSSRGEASNPDFGKT 
IRDLPLSlWLLLKNPTFILLCLAGATEATIillPGMSTFSPKFLES 
QFSLS AS EAATLFGYLWPAGGGGTFLGGFPVNKLRLRGSAVr K 
FCI>FCTVVSLLGILVFSUlCPSVPMAGVTASyGGSLLPEGHI*NL 
TAPCNAACS CQPEHYSPVCGS DGLMYFSLCHAGCPAATETNVDG 
OKVYRDCS Cr PQNLSSGFGHATAGKCTST 



6773


6774 


a 


630 


PWEAPK£HKYKAEEHTVVLTV-I^EPCHFPFQYHRQLYHKet-|nc^ 

RPGPQPWCATTPNFDQDQRWGYCLEPKKVKDHCSKHSPCQKGGT 

CVNMPSGPHCLCPQHLTGNKCQKEKCFEPQLLRFPHKNEIWYRT 

EQAAVARCQCKGPDAHCQRLASQACRTNPCLHGCRCLEVEGHRL 

CHCPVGYTG?PCDVGE*GSGASRRPAPRWDGIAR 


6775 


146


389 


i.TEX,SDQQ>fVLFFII.SS/WVPTFI^MDVDGRVIKADSFSklISS" 
GLRIGFLTGPKPLIERVILKIQVSTLHPSTFNQLMISQ 


6776


104 


614 


ICPSQLRVLTARGGRKAPSPQLWTLVUILIEEKWRSHRILRMNS 
GRPETMENLPALyTIFQGEVAMVTDYGAFIKIPGCRKQGL\7HRT 
HMSSCRVDKPSEIVDVGDKVWVKLIGREMKNDRIKVSLSMKVVN 
QGTGKDLDPNNV\SLSKKRGGGDPSRlTLGRRSPLRt.S 


6777


3 


1108


HERHERHEGAIjSQDAI.LRI£ I PLDSNMRPEKCRRFVHPQWOI.T.M 
\ I'NGTFP^TSDADMEPCVDGWVYDRISFSSTIVTEWDLVCDSQSL 
: TSVAKFVFMAGMMVGGILGGHLSDRFGRRFVLRWCYLQVAIVGT 
CAAlAP-rPLIYCSLRPLSGlAAMSLlTOTIMLIAEWATHRPOAM 
G I TLGMCPSGI AFMTLAGLAFAI RDWHI LQ1.WSVP YFVI FLTS 
SWLLESARWLirNNKPEEGLKELRKAAHRSGMKNARDTIiTLEIL 
KSTMKKELEAAQKKKPFLGERLHMPNICKRISLLPFTKFANFKA 
YFGLNIxHG/XjKHLGNNVFLLQTLFGAV/TPPGQLVLHI/SHWGSG 
RVS S RGRVNCIiGLFVLQVW 




779 


63 


CFFHGPAWRDCEVRATFAKKQGQ6GIISCXAfc'SPAQPLYACX5SY~ 

GRSLGLYAWDDGSPIALLGGHQGGITHLCPHPrxSNRPFSGARKD 

AELLCWDLRQSGYPLWSLGREVTTNQRIYFDLDPTGQFLVSGST 

SGAVSVWDTDGPGNDGKPEPVLSFLPQKDCfTNGVSLHPSLPLLG 

HCLPVSVCFI.SPTESGGRRRGAGPSU5SPRRHVHLECRLOLWWC 

GGGARLQHP* *SPRARKGR 


6778 
6779 


3iX 


BOS 


iqsitdesrgsirrknpantrlrlnvp\ebtagdse/erspeeh"" 

VQADPRIRSASPKCPTSSPFPKGRSPEGBGET\DPHKVHFHPGP 
K0KSVAEKN\KGP\SPVSSEGIKDFPSMKPEWENI^QSWVRRMH 
TNAVRLNEVIVKKSRDAKLVLLNMPGPPRNRNGDENY 


6780


2 


53S 


RALRRQPRLl^AANGIEPESmiSEPIKGSRKPCVNKEEIALKKP 

makcawkgpreppqdaraeaespggasesdqdgghesppkkkav 

AWVSAKNPAPKRKKKKVSLGPVSYVLVDSEDGRKKPVMPKKGPG 

srreasdqkaprgqqpaeatastsrgpkakpegsprratnesrk 


6781


3 


403 


HbVNDNfKPEXNXNLKSPGKEEISYI?EGDPIDTFVALVRVQDKD 
SGmGElVCKLHGHGHPKr^KTYENKYLILTNArLDREKRSEYS 
LTVIAEDRGTPSr^STVKHFTVQXNDINDNPPHFQRSRYEFVISE 


6782 


1 


1269 


APTRPVFPTLgULiSSSKEPSNSLNJLPHSNELCSSLVHPELSEVS 
SNVAPSIPPVMSRPVSSSSjCSTPLPPNQITVFVTSNPITTSANT 
SAALPTHI^SALMSTVVTMPNAGSKVMVSEGQSAAQSNARPOFl 
TPVFINSSS I IQVMKPSOPSTIPAAPLTTNSGLMPPSVAWGPL 
HIPQNIKFSSAPVPPNALSSSPAPNtQTGRPLVLSSRATPVQLP 
SPPCTSSPWPSHPPVQQVKELNP0EASPQVWTSADQNTI.PSSQ 
STTMVSPLLTWSPGSSGNRRSPVSSSKGKGKVDKIGQIX.LTKAC 
KKVTG5LEKGEEQYGADGETEGQGX>DTTAPGLMGTEQLSTELDS 
KTPTPPAPTLLKI^TSSPVGPGTASAGPSLPGGAiPTSVRSIVTr 
LVPSEL I SAVPTTKSNHGG X ASESLAG 




3 


1327


RKPTVIRIPAKPGKCLHEDPQSPPPLPAEKPIGNTFSTVSGKLS 
NVERTRNLESNHPGQTGGFVRVPPRLPPRPVNGKTXPTQQPPTK 
VPPERPPPPKLSATRRSNKKLPFimSSSDMDLQKKQSNXATGIiS 
KAKSQVFKNQDPVLPPRPKPGHPbYSKYMLSVPHGXANEDlVSQ 
^PGELSCKRGDVLVMLKQTENNYXiECQKGEDTGRVHLSQMKLXT 
Pr»DEHLRSRPNPFSPPKAPSf£AQKPVr>SGAPHAVVLHDFPAEOV 






DDmi,TSGEIVTfLL£KIDTDWYRGNCRNQIGIFPANYVKVIIDI 
PEGGNGKRECVSSHCVKGSRCVARFEYIGEQKDELSFSEGEIII 
LKEYVNEE WARGE VRGRTG I PPLNFVE PVEDYPTSGANVLSTKV 
PLKTKKEDSGSNSQVNSLPAEWCEALHSFTAETSDDLSFKRGDR 


6783 3 


1750 


S YHHHHAQUbAAAS PNLTASQKTVTTTSMITTKTLPLVIiKAATA 
TMPASWGQRPTIAMVTAINSQKAVLSTDVQNTPVNLQTSSKVr 
GPGAEAVQIVAKNTVTLQVQATPPQPIKVPQFIPPPRLTPRPNF 
LPQVRPlCPVAQNNIPlAPAPPPMLAAPQLrQRPVMLTKFTPTTI. 
PTSQNSIHPVRWNGQTATIAKXFPMAQX^TSIVIATPGTRLAGP 
QTVQLSKPSLEKQTVKSHTETDEKQTESRTITPPAAPKPKREEN 
PQKLAPMVSLGLVTHDHLEEIQSKRQERKRRTTAKPVYSGAVPE 
PERKKSAVTYLNSTMHPGTRKRGRPPKYNAVLGPGALTPTSPOS 
SHPDSPEKTEKTETTFTFPAPVQPVSIjPSPTSTDGDIHEDFCSVC 
RKSGQI^bMCDTCSRVYHLDCLDPPLKTIPKGMWlCPRCQDQMLK 
KEEAIPWPGTLAIVHSYIAYKAAKEEEKQKLLKWSSDLKQEREQ 
LEQKVKQLSNSISKCMEMKNTILARQKEMHSSLEKVKQLIRLIH 
GIDLSKPVDSEATVGAISNGPDCTPPANAATSTPAPSPSSQSCT 


6784 3 

• 


1750


SYHHHHACK3SAAASPKLTASQKTVTTTSMITTKTLPLVLKAATA 

TMPASWGQRPTrAMVTAINSQKAVLSTDVQNTPVNLOTSSKVT 

GPGAEAVQIVAKNTVTLQVQATPPQPIKVPQFIPPPRLTPRPNF 

LPQVRPKPVAQNWIPIAPAPPPMIAAPQLIQRPVMLTKFTPTTL 

PTSQNSIHPVRWNGQTATIJUCTPPMAQLTSIVIATPGTRLAGP 

QWQLSKPSLEKQTVKSHTETDBKQTESRTITPPAAPiCPKREEN 

PQiCLAFMVSt^LVTHDHLEEIQSKRQERKRRTTANPVYSGAVPE ^ 

PERKKSAVTYLNSTMHPGTRKRGRPPKYNAVLGPGALTPTSPQf ' 

SHPDSPE^FEKTETTPTFPAPVQPVSLPSP^STDGDIHEDFCSVC 

RKSGQLLMCDTCSRVYHLDCLDPPLKTIPKGMWICPRCQDONI^K 

KEEAIPWPGTLAIVHSYIAYKAAKEEEKQKLLKWSSDLKQEREQ 

LEQKVKQLSNSISKCMEMKWTIIARQKEMHSSI>EKVKQbIRi:.IH 

GIDIiSKPVDSEATVGAISNGPDCrPPAMAATSTPAPSPSSQSCT 
ANCNQGEETK 


6785 1


528 


l.GWWIaHYCSMYSKPECLKI,LLRSKPTVDIVNQAGETAIJDXAkR 

KPSPVKKERSPRPQSFCHSSSrSPQDKLALPGPSTPRDKQRLSY 
GAFTNQIFVSTSTDSPTSPTTEAPPLPPRNAGKGPTGPPITPHR 


6786 X820 

2646 


1397 


'■^'^■^ ■'-f^ *-*f**r ±x\.izti.u\v^nv^tLLje r\Uj. \ xJRiUjXVARFyGGXSYQSO 
INHIRNGIDXLVGTPGRIKDHLQSGRLDLSKLRHWLDEVDQMI, 
DLGPAEQVEDI IHES YKTDSEDNPQTLIiPS ATCpQWVYTVA\KK 
YMKSRYEQTOLDGKMTQKAATTVEHIiAIQCHWSQRPAVlGDVLQ 
VYSGSEGRAIIFCETKKNVTEMAMMPHXKQNAQCLHGDrAQSQR 
EIXLKGFREGSPKVLVATNVAARGLDIPEVDLVIQSSPPQDVES 
YIHRSGRXGRAGRTGICICFYQPRERGQLRYVEQKAGIXFKRVG 
VPSXMDLVKSKSMDAIRSLASVSYAAVDPFRPSAQRI.IBEKGAV 
DALAAALAHISGASSFEPRSLIXSDKGFVTMTLESLEEIQDVSC 
AWKELNRKLSSNAVSQITRMCIiIiKGNMGVCFDVPTTESE£U;QAE 
WHDSDMILSVPAKLPEIEEYYDGNTSSNSRQRSGWSSGRSGRSG 
RSGGRSGGRSGRQSRQGSRSGSRQDGRRRSGNRNRSRSGGHKRS 
FD^VPYHLVDFLSDFIiVDSVYliTGRQIDHXiXGLXGLIDHLTSHS 

svm 




2270 


PSSFPKKVPLEELEEPPK*KRSGIX3SLXPKSQIQNGP*PQXFPP 
FELGSPSGVrSAHCNLRLLGSSDSPAPASRVAGIlGTCHHAWLI 
LVFLVEMGFHHVGQAGLKLLTL\VIHPPWPPKVLGLQT 


6788 16 


936 ( 


3GTVDLR\DMIAVSVXAAVRGGR/AXVRRVRBSNVXiHEKSKGiCX 
^EGAEDKMXSGDVLSNRKMFYLI,KXAFPSVQrKXEEHVD\ELDO 



f 








EVIIiVJGS * DS *GYPKGK* LLPKEVPSR/RVliliSGIiTPLDATQEV" 

FTEDIiSK\YVTTMVCVAVNGKPMLGVIHKi>FSEyTAMAMVDGGS 

NVKAJiS SYNE KTPRI WSRSHSGMVKQVALQTPGNQTTI I PAGG 

AGYKVliALLDVPDKSQBKADLYXHVTYIKKWDlCAGNAILKALG 

GHMTTLSGEElSYTGSDGlEGGi:j::ASIRMNHQAl,VRKIiPDLEKT 

GHK 


6789


2 


678 


GNGINVLKIAPESAIKFMAY£QIiaaiVW**PGDS*GF/'YERI,VA 
GSI^AGAIAQSSIYPMEVLKTRMALIRKTGQYSGMLDCARRIIARE 
GVAAFYKGYVPHMLGIIPYAGIDLAVYETLKNAWLQHYAVNSAD 
PGUPVLLACGTMSSTCCQLASYPLALVRTRMQAQASrEGAPEVT 

MSSLFKHII.RTEGAFGriyRGLAPWFMKVIPAVSISYVVYENLKr 
TLGVQSR 


6790 


2 


4068


APPAGRRRMQAAPRAGCGAAIiliLW I VSS CLCRAWTAPSTSQKCD 
EPtiVSGLPHVAFSSSSSlSGSYSPGYAKINKRGGAGGWSpSDSD 
HYQWLQVPFGNRKQISAIATQGRYSSSDWVTQYRiyiLYSDTGRNM 
KPYHQDGNIWAFPGNINSDGWRHELQHPIXARYVRIVPLDWNG 
EGRIGLRI EVYGCSYWADVINPDGHVVLPyRPWJKKMKTLKDVI 
ALNFKTSESEGVILHGEGQQGPYITLBLKKAKLVI*SIiNLGSNQL 
GPIYGHTSVMTGSIiLDDHHWHSWIERQGRSINI^TLDRSMQHFR 
TNGEFDY LDIiDY E I TFGGIPPSGKPSSSSRKNFKGCMES INYNG 
VNITDLARRKKLEPSNVGNLSFSCVEPYTVPVFFWATSYLEVPG 
RLNQDLFSVSFQFRTWNPNGLLVFSHFADNIXSNVElDIiTESKVG 
VHINITQTKMSQ ID ISSGSGLNDGQWHEVRFIiAKENFAILTIDG 
DEASAVRTNSPLQVKTGEKYFFGGFLNQMNNSSHSVLQPSFQGC 
MQI*IQVDDQLVNI,YEVAQRKPGSFAKfVS IDMCAl IDRCVPNHCE 
HGGKCSQTWDSPKCTCDETGySGATCHNS tYEPSCEAYKHLGQT 
SNYYWIDPDGSGPLGPLKVYCWMTEDKVWTIVSHDLQMQTPWG 
YNPEKYSVTQl^VYSASMDQISAITDSAEYCEQYVSYFCKMSRLL 
NTPIX3SPYt>?VArGK^UreKHYYWGGSGPGIQKCACGlERNCTDPK 
YYCNCDADYKQWRKDAGFLSYKDHLPVSQWVGDTDRQGSEAIOU 
SVG PLRCQGDRNYWNAASFPNPSSYtiHPSTFQGETSADISPYPK 
TLTPWGVPIiENMGKEDFIKLELKSATEVSFSFDVGNGPVEIWR 
SPTPLNDDQWHRVTAERNVKQASLQVDRLPQQIRKAPTEGHTRL 
ELYSQLPVGGAGGQQGFtGCIRSLRMNGVTIjPLEERAKVTSOFI 
SGCSGHCTS YGTNCENGGKCLERYHGYS CDCSNTAYDGTFCNKD 
VGAFFEEGMWLRYNFQAPATNZVRDSSSRVDNAPDQQNSHPDIiAQ 
EEIRFSFSTTKAPCILLYISSFTTDPLAVLVKPTGSLQIRYNIiG 
GrREPYWIDVDHRNMANGQPHSVWirRHEKTrPI,Ki:,DHyPSVSy 
HLPSSSDTLFNSPKSLFLGKVIETGKIDQEIHKYNTPGFTGCLS 
RVQFNQlAPLKAALRQTKASAHVHiQGELVESNCGASPIiTLSPM 
SSATDPWHLDHDDSASADPPYNPGQGQAIRNGVNRNSAI IGGVI 
A\ WI FTPSLGTP \VLP * SR*HVS PttKGTLP I PNEAKGAGSRQK 
KPGRRPSMNNDPPTSQRPtDESKKEWPiiriRGGYLAMG 




6791 


1801 


1193 


TGHEGAKGEKGDKGDIASPRGERGQHGPKGBKGYPGIPPEL/PGW 
SAW* SWIiTAASTKVQAILLPQPLE*IiGLQIAFMASIATHFSNQ 
NSGIIFSSVETNIGNFFDVMTGRFGAPVSGVYFFTFSMMKHEDV 
EEVYVYIJ«1HNGWTVPSMYSYEMKGKSDTSSmiAVL,KLAKGDEVW 
LRMGNGAIiHGDHQRFSTPAGFLLFKTK 




6792 


33 


1073 


VRIlTNWGVDMYIiFSLGSESPKGAIQHlVSTEKTILAVERNKVLL 
PPLWNRTFSWGFDbFSCCLGSYGSDKVLhSTFENLAAWGRCIiCAV 
CPSPTTIVTSGTSTWCVWELSMTKGRPRGIiRIiRQALYGHTQAV 
TCLAASVTFSIitjVSGSQDCTCILWDLDHLTHVTRLPAHREGISA 
ITISDVSGTIVSqAGAHLSLWNVNGQPLASITTAWGPEGAITCC 
CLMEGPAWDTSQIIITGSQDGMVRVWKT/VGCEDVCSWTASRRG 
APGSASKPKRPQVGEEPGLESRAGR*HCFDREAQQNQP\PVTAL 
AVSRNHTKIiLVGDERGRIFCWSADG*EERGSRGSGTTVPG 





6793 


2340 


805 


GRKEANY XYGSLTQAGTVSLGLDAEGQEVFVPFSAVIjPMVAPND 
LVFDGWDISSLNLAEAMRRAKVLDWGtiQEQLWPHMEALRPRPSV 
YIPEFIAANQSARADNLIPGSRAQQLEQIRRDIRDFjRSSAGLDK 
VIVIiWTANTERFCEVIPGLNDTAENLLRTIKLGLEVSPSTLFAV 
ASILEGCAFLNGSPQNTIiVPGALEIiAWQHRVFVGGDDFKSGQTK 
VKSVLVDFIiIGSGIiKTMSIVSYNHLGWNDGENLSAPr,QPRSKEV 
SKSNWDDMVQSNPVLYTPGEEPDHCWIKYVPYVGDSKRAIiDE 
YTSELNU^TNTLVLHNTC3DSLIiAAPIMLDIiAI*LTELCQRVSF 
CTDMDPEPQTFHpVLSLbSFtiFKAPLVPPGSPWNALFRQRSCI 
ENILRACVGLPPQNHMIiliEHKMERPGPSIiKRVGPVAATYPMLNK 
KGPVPAATNGCTGDANGHLQEEPPMPTT*GPGHTVSRIiFLPAAP 
HDPTIiKAPTNKGRCMFSPPSTWGSWGL 


6794 


169 


1349 


DDVKRKPEASAH*EKPGPPSRPGVRGGRERAGGRGSHGARSCR\ 
EPAPPAPAPPEDHPDEEMGFTIDIKSFLKPGEKTYTQRCRLFVG 
NLPTDITEEDPKRLPERYGEPSEVFlNRDRGFGFXRIiESRTLAE 
lAKAELDGTILKSRPLRI RFATHGAALTVKNLS P WSNELLEQA 
FSQFGPVEKAVWVDDRGRATGKGFVEFAAKPPARKAI.ERCX3DG 
AFLIiTTTPRPVIVEPMEQFDDEDGLPEKLMQKTQQYHKEREQPP 
RFAQPGTFEFEYASRWKALDEMEKQQREQVDRNIREIAKEKLEAE 
MEAARHEHQLMLMRQDLMRRQEEUlRI^EELRNQELQKRKQIQt^R 
HEEEHRRREEEMIRHREQEELRRQQEGFKPNYMENYVCHFLR 


6795


1740 


1010 


GPRRQTQVRDlIEIiDSF* DWAAQETDCAQNSGERl, * KGV/ liENFS 
TMSKSAVKISLDL1.SNPLCEQI5QDLLNMVTALDTAMKRMDAFNQ 
EKVWQIQKTVIEPLKKFGSVFPSLNWAVKRREQALQDYRRLQAK 
VEKYEEKSKTGPVLAKLHQAREEIiRPVREDFEAKNRQLLEEMPR 
FYGSRLDYFQPSFESIilRAQWYYSEMHKIFGDLSHQLDQPGHS 
DEQRERENEAKLSELRTVLS IVADD 


6796 


48 


683 


GKEIQIPriKIAWX4liFGI.E*PVGALGKGWSP**SHVALGQIiGW 
LTRAVRSSWRWELCVSAQEWSQRSA* SS PSP VGACPSLNPPET 
SVQEGRDCWQR*LPRI>FSALVGQPGCWPQGAPPERCV*PGRCKW 
HLQSQVLR*ERRRCCRCIiPRFA*GWRRRHQRLGLGIHPAPLGST 
SPPHPEGNSCX}CRR*GWAAEIiRLPSSWI.*GKLGC* 


6797 


1620 


211 


TERMTPSQPTRGSSCTRPSSMLWTSTWRCLTCHWAGMRMSWGV 
TLGPMAQGLLSASGTTTEATWTRPTTHLTLIRWWLLTASRVDPP 

erpppppsddlti»lessssyknl/daqipq/dv?smspstsg*rp 
ltsrass imrsrtai psas *s rlttkhtvggsps awrprptsrs 
vstpvssstettasgscltwwsss papcpsssapahs feascck 

TSIjWGSCGGSGDGSSACGSGWNLSMAGTSCSSPAMCSPSRAPS* 
RSASRPRTWRATTSAASSWAPRRCWCGWA*SAT*PSSTTTISSS 
PHCGWP CPASCAS AAAWLSS7*WATAS VAGSCWGP IM* SSAHSPW 
CI>SACSRSSMGTTCL*RSPP\SGASRAAAAWCX3SSPSSTFTPSS 
ASSSTWCS ASSSRSS PAPTTPSS IPAAQAQRRAS CRPTSHSART 
APPPAS SAAGAARPAAFSAAAEGTPRRS IRCW 


6798 


3894 


1696 


STISWESLESWI/NKATNPSNRQEDWEYIlGFCDQINKEIiEG*VS 
ALWGQLRGSGLGRGTTMAKEGQPGSPRI^ALECVIJbVPQ\PQIA 
VRLIiAHKIQSPQEWEALQAIiTYLGDRVSEKVKTKVIEIiLYSWTM 
ALPEEAKIKDAYHMLKRQGIVQSDPPIPVDRTLIPSPPPRPKNP 
VFDDEEKSKLLAKLLKSKNPDDLQEANKLIKSMVREDEARIQKV 
TKRLHTLEEVNNNVRLI^SEMIjLHYSQEDSSDGDREWIKELFDQC 
ENKRRTLFKLASETEDNDNSLGD I IiQASDNIjSRVINS YKT IIEG 
QVXNCEVATIiTLPDSEGNSQCSNQGTLIDIiAELDTTNSriSSVIiA 
PAPTPPSSGIPILPPPPQASGPPRSRSSSQAEATIiGPSSTSNAL 
SWLDEELLCLGIADPAPNVPPKESAGNSQWHliLQREQSDU3FFS 
PRPGTAACGASDAPLLQPSAPSSSSSQAPLPPPFPAPWPASVP 
APSAGSSIiFSTGVAPAlAPKVEPAVPGHHGLALGNSALHHLDAL 
BQLLEEAKVTSGLVKPTTSPLIPTTTPARPIiI.PFSTGPGSPLFQ 
PLSFQSQGS PPKGPELSIiAS IHVPLESIKPSSALPVTAYDlOSfGF 
RILFHFAKECPPGRPOVLVWVSMLNTAPLPVKSIVLQAAVPKS 
MKVKLOPPSGTELSPFSPIQPPAAITQVMLLANPLKEKVRLRyK 
LTFALGEOIiSTEVGEVDQFPPVEQWGWL 


6799


3894 


1696 


STISWESLESWLNKATNPSNRQEDWEyiIGFCDQINKBLEG*VS 
ALWGQLRGSGLGRGTTM7UCEGQPGSPRLSALECVIiliVPQ\pQlA 
^TlLr^KIQSPQEWEALQAIiTYIjGDRVSEKVKXICVIELLYSWTM 
ALPEEAKIKDAYHMIiKRQGIVQSDPPIPVDRTLlPSPPPRPKNP 
VFDDEEKS KLLAKLLKSKNPDDLQEANKL IKSMVREDEARIQKV 
TKKLHTLEEVNNNVRLLSEMLLHYSQEDSSDGDRKI^IKELPDQC 
ENKRRTLFKLASETEDNDNSLGDILQASDNLSRVINSYKTIIEG 
QVINGBVATLTLPDSEGNSQCSN<Kni*IDlJ^LDTTMSI/SSVIA 
PAPTPPSSGIPILPPPPQASGPPRSRSSSQAEATIiaPSSTSNAli 
SWLDEELLCliGIiADPAPNVPPKESAGNSQWHLIiQREQSDr^FPS 
PRPGTAACGASDAPIiLQPSAPSSSSSQAPLPPPFPAPWPASVP 
APSAGSSLFSTGVAPAIAPKVEPAVPGHHGLALGNSAbHHLDAL 
DQLLEEAKVTSGLVKPTTSPIiI PTTTPARPLL.PFSTGPGS PLPQ 
PLSFQSQGSPPKGPELSLASIHVPLESIKPSSALPVTAYDKNGF 
Rri:,FHPAKECPPGRPDVT.VVWSMLNTAPLPVKSIVLQAAVPKS 
MKVKLQP PSGTELSP PS PIQPPAAITQVMIiIiANPLKEKVRLRYK 
LTFALQEQLSTEVGEVDQFPPVEQWGNL 


6800


404 


1646 


RRSPSTGIiSPVPQPSSPSLSDYSIPWSIiLLSGTlAWATPGK*AG 
* PQAW * LGLAPAIAF I /GLTRGRKQNKEKMAEGGSGDVDDAGDC 
SGARYNDWSDDDDDSNESKS IVWYPPWARIGTEAGTRARARARA 
RATRARRAVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILE 
AAIilALGNNAAYAFNRDIIRDLGGLPIVAKIIOTRDPlVKEKAL 
IVIJJNI^VNAENQRRLKVyMNQVCSDDTITSRtiNSSVQLAGLRlk. 
TNMTVTNEYQHMLANS I SDFPRLPSAGNEETKLQVIiKLIiIiNIAE 
NPAMTRELLRAQVPSSriGNSLFKKKENKEVILKLLVIFENINDN 
FKWEEKEPTQNOFGEGSLFFFLKEFQVCADKVLGIESHHDFIjVK 
VKVGKFMAKIiAEHMFPKSQE 


6801 


2 


1755


SAEEFESQQAS VTMHDVDAES FEVLVDYCYTGRVSLSEANVERL 
YAASDMLQLEYVREACASFLARRLDL'TNCTAILKFMAFGHRKL 
RSQAQS Y I AQNFKQLSHMGS I REETriAI)I.TI*AQLrAVLRIX>SLD 
VESEQTVCHVAVQWIiEAAPKERGPSAAEVFKCVRWMHPTEEDQD 
YIiEGIiLTKPIVKKYCLDVIEGALQMRYGDLl»YKSr.VPVPNSSSS 
/R*-QQQLSCICSRKSTPBTGYVCQGDGDIiLWTPQRSliS\RYDPY 
I IMro t^AJXor/Ull V JLooAVW VtolrlJJn^ 

VYKJ>AQNSWQQLADRLLCREGMDVAYIiNGYIYILGGRDPr'n3VK 
LKEVECYSVQRNQWALVAPVPHSFYSFELIWQNYLYAVMSKRM 
LCYDPSHNMWLNCASLKRSDFQEACVFNDE I YCICDI PVMKVYN 
PARGEWURI SNI PLDSETHNYQ IVNHlXiKLLLITSTTPQWKKNR 
VTVYEYDTREDQWINIGTMLGLLQFDSGFICLCARVYPSCLEPG 
QSFlTEEDDARSESSTEWDIiDGFSELDSESGSSSSFSDDEVWVQ 
VAPQRNAQDQQGSL 


6802


157


1341 


ETFPLFFFIiLSKTPGKTASMAHFVQGTSRMIAAESSTEHKEO^ ' 
PSTRKNLMWSLEQKIRCLEKQRKELLEVNQQWDQQPjRSMKELYE 
RKVAELKTKLDAAERFLSTRERDPHQRQRKDDRQREDDRQROLT 
RDRLQREEKEKERLNEELHELKEEinCLLKGKiTrrANKEKEHyEC 
EIKRLNKALQDAIiNIKCSFSEDCLRKSRVEFCHEEMRTEMEVLK 
QQVQIYEEDFKKERSDRERLNQEKEELQQINETSQSQLMRLNSQ 
IKACQMEKBKLEKQLKQMYCPPCNCGLVFHLQDPWVPTGPGAVQ 
KQREHPPDYQWYALDQLPPDVQHKAN/0WCLAPPPVCCQAG/PR 
TPGLK*SSCI,WLPKC*MPRFILSKESPSVEVHTNRERQQATRER 
G 


6803 


1 


2203 


KX.SGRPYRHMQVLGTSKLYDIRKTIFTFTPQFIDQQQFYriALDN 
KMIVEMLRTDLSYLCSRWRMTGUPTXTFPISHSMLDEbtitSLNS 
S ILiAAIJiKMQDGYPGGARVQTGKLSEFLTTSCCTHLSFMDPGPE 
GKLYSEDYDDNYDYLESGNWMNDYDSTSHARCGDEVARYLDHLL 
AHTAPHPKLAPTSQKGGLDRFQAAVQTTCDLMSLVTKAKELHVQ 
NVHMYIiPTKLFQASRPSFNLLDSPHPRQENQVPSVRVEIHLPRD 
QSGEVDFKALVUSLKETSSLQEQADILYMbYTMKGPDWNTELYN 
ERSATVRELLTEriYGKVGEIRHWGLIRYlSGILRKKVEALDEAC 
TDLLSHQKHLTVGLPPEPREKTISAPLPYEALTQLIDEASEGDM 
SISILTQEIMVYLAMYMRTQPGLFAEMFRLRIGLIIQVMATEIA 
HSLRCSAEEATEGLMNLSPSAMKNLLHHILSGKEFGVERK/SVR 
PTDSNVSPAISIHEIGAVGATKTER'TOlMQt.KSEIKQVEPRRLS 
ISAESQSPGTSMTPSSGSFPSAYDQQSSKDSRQGQWQRRRRLDG 
ALNRVPVGFYQKVWKVLQKCHGliSVEGFVLPSSTTREMTPGEIK 
FSVHVES\VLNVI>LRPEYRQrj.VEAILVLTMI*ADIEIHSIGSII 
AVEKIVHXA1TOLFI^EQKTLGP\DDTMLAKDPASG\1CTLR\YI3 
SAPSGRFGTMTYLS \RAA\ATY VQEFLP\HS I CAMQ 


6804 


1 


951


GSPGKKEEKAKNKESLCMENSSNSSSDEDEEETKAKMTPTKKYW^ 

GLEEKRKStRTTGFYSGFSEVAEKRIKLLNNSDERLQNSRAKDR 

KDVWSSIQGQWPKKTLKELFS0SDTEAAASPPHPAPEEGVAEES 

LQTVAEEESCSPSVEIiEKPPPVNVDSKPlEEKTVEVNDRKAEFP 

SSGSNPS A* I PLP YliHLNRliHQSL * QKGSRQQSSVTVSEPLAPN 

QBEVRSIKSETD$TIEVOSVAGELQDU}SERE*tiASRF*CQCfc:L 

KQ**SARTRTS*KSLYRSEKSERCSGRRKriKKAEKKP*SMSGK 
QQKEGKRHK 


6805 


1539 


206 


KyPDI^KYFGKSFDVSVSESSSLLSNDLPKFADGIKfiJWiiKQNYI. 

VPSPVUIILDHTAFSTEKSADIVICDEECDSPESVNQQTQEESE 

lEVHTAEDVPIAVEVHAISEDYDIETENKSSESLQDQTDEEPPA*^- 

KLCKILDKSQALNVTAOQKWPLLRANSSGLYKCELCEFNSKYFS 

DLKQHMILKKKRTDSNVCRVCKESFSTNMLLlEHAKIiHEEDPYI 

CKYCDYKTVIFENLSQHIADTHFSDHXiYWCEQCDVQPSSSSBIiY 

LHFQEHSCDEQYLCQFCEHETKDPEDLHSHWNBflftCKLrELSD 

KYlWGEHGQYSLIiSKITFDKCKNFFVCQVCGFRSRUiTlfVNRHV 

AIEHTKIFPHVCDDCGKGFSSMLE\lAKHLbfSHLSEGIYI^QYW 

EYSTGQIEDLKIHr.DFKHSADLPHKCSDCI»MRFGNERELISHI,P 
VHETT 


6806 


272 


3794


VALCFPNSDPVMFMDAFYGCLLAELGPVPIEVPLTRKDAGSQQV " 

GFI.LGSCGVFIALTTDACQKGLPKAQTGEVAAFKGWPPLSWX,VI 

DGKHIAKPPKDWHPXiAQDTGTGTAYIEYICrSKEGSTVGVTVSHA 

SLIAQCRALTQACGYSEAETLTWVLDFKRDAGLWHGVLTSVMNR 

MHWSVPYALMKANPLSWIQKVCFYKARAALvksRDMHWSLLAQ 

RGQRDVSLSSLRMLIVADGANPWSISSCDAFLNVFQSRGIiRPEV 

ICPCASSPEALTVAIRRPFDLGGPPPRKAVLSMNGLSYGVIRVD 

TEEKIiSVLTVQDVGQVMPGANVCWKIiEGTPYLCKrDEVGEICV 

SSSATGTAYYGIiljGITKNVFEAVPVTTGGAPIFDRPFTRTGLIiG 

FIGPDHLVPIVGKLDGLMVTGVRRHNADDWATALAVEPMKFVY 

RGRIAVFSVTVLHDDRIVLVAEQRPDASBEDSFQWMSRVLQAID 

SIHQVGVYCLALVPANTLPKAPLGGIHISETKQRFI,EGTLHPCN 

VLMCPHTCVTNLPKPRQKQPEVGPASMIVGNIiVAGKRIAQASGR 

EIiAHIiEDSDQARKFLFLADVIiQWRAHTTPDHPliFI;IiI,NAKGTVT 

STATCVQIiHKRAERVAAALMEKGRLSVGDHVALVYPPGVDLIAA 

FYGCLYCGCVPVTVRPPHPQNIiGTTLPTVKMIVEVSKSACVLTT 

3AVTRLLRSKEAAAAVDIRTWPTILDTDDIPKKKIASVFRPPSP 

DVLAYLDPSVSTTG ILAG VKMSHAATS AliCRS IKLQCELYPSRQ 

rAICIiDPYCGLGPALWCLCSVYSGHQSVLVPPLELESKVSLWLS 

WSQYKARVTFCCYSVMEMCTKGLGAQTGVLRMKGVWLSCVRTC 

WVAEERP\riaLTQSFSKIiFKDLGLPARAVSTTFGCRVNVAIC 
LQGXAGPDPTTVYVDMRALRHDRVRLVERGSPHSLPLMESGKIL' 
PGVKVIIAHTETKGPLGDSHLGEIWVSSPHNATGYYTVYGEEAL 
HABHFSARLSFGDTQTIWARTGYLGFLRRTELTDASGGRHDALy 
WGSLDETLEIiRGMRYHPIDIETSVIRAHRSIAECAWTWTNLL 
VWVELDGLEQDALDLVALVTNWIiEBKyLVVGVWIVDPGVIP 
INS RGEKQRMHLRDGFIiADQLDPI YVAYNM 


6807 


1444 


606 


v^HUrVHAMFTCFPKCLGFSPPVNVTVSPRSEESHTTTVSGGNG 
,SVFQAGPQLQAJbANLEARRGSIGAALSSRDVSGLPVYAQSGEPR 
RLTQAQVAAFPGENALEHSSDQDTWDSLRSPGPCSPLSSGGGAE 
SLPPGGPGHAEAGHLGKVCDFHLKHQQPSPTSVLPTEVAAPPLE 
KILSVDSVAVDCAYRTVPKPGPQPGPHGSLLTEGCLRSLSGntiN 
RFPCGMEVHSGQRELESWAVGEAMAXliKFPMGAMSYCIiRDRSR 
FLFRLPMGLSCPLQVQ 


6808


2063 


737 


GVGSGAASAIARSRPIASRLSSRRRTRAPRSGAMQRI1AMDL.RML 
SREriSLYLEHQVRVGFFGSGVGXiSLILGFSVAYAFyYLSSIAKK 
PQLVTGGESFSRFLQDHCPWTETYYPTVWCWEGRGQTLLRPFX 
ITSKPPVQYRNELIKTADGGQISLDWFDNDNSTCYMDASTRPTI 
LLLPGLTGTSKESYILHMIHLSEEIiGYRCWFKNRGVAGENUJT 
PRTYCCANTEDIiETVIlIJIVHSLyPSAPFIiAAGVSMGGMLIiLNyL 
GKIGSKTPLMAAATFSVGWNTFACSESLEKPIjNWr,LFKYYIjTTC 
LQSSVNKHRHMFVKQVDMDHVMKAKS IREFDKRFTS VMFGYQTI 
DDYYTDASPS PRLKS VG IP VLCI.NS VDDVFS PSHAIP IKTAKQN 

PNVAIiVLTSYGGHlGFLEGIMPRQSTYMDRVFKQFVQAMVEHGH 
ELS 


6809 


939 


65 


DYSGQTPVPTEHGMTLYTPAQTHPEQPGSEASTQPIAGTQfVPQ 

TDEAAQTD'SQPhUPSm' TBKQQPKRLirVSNI PFRFRDPDX.RQMF 

GQFGKILDVEIIFNERGSKGPGFVTFETSSDADRAREKIiNGTIv" 

EGRKIEVNNATARVMTNKKrXGNPYTNGWKLNPWGAVYGPEFYA 

VTGFPYPTTGTAVAYRGAHLRGRGRAVYNTFRAAPPPPPIPTYG 

AVVYQDGFyGAEI\IiEATQPrDTLSPLQRRQPTATVTAESTQr.P 

TRTITPSGPRRPTALEPCETFHRFLLGP 


6810 


939 


65 


DYSGQTPVPTEHGMTLYTPAQTHPEQPGSEASTQPIAGTQTVPQ 
TDEAAQTDSQPLHPSDPTEKQQPKRLHVSNIPPRFRDPDLRQMF 
GQFGKILDVEIIPNERGSKGFGPVTFETSSDAZ>RAREKI)NGTIV 
EGRKIEVNNATARVMTNKKTGNPYTNGWKIiNPWGAVYGPEFYA 
VTGFPYPTTGTAVAYRGAHLRGRGRAVYNTFRAAPPPPPIPTYG 
AWYQDGFYGAE I \LEATQPTDTI^ PLQRRQPTATVTAESTQLP 
TRTITPSGPRRPTALEPCETFHRFI,LGP 




1522


658 


DZiVTVWSFVDCRVIASTHGH^KSWVSWAFDPYTTSVEEGDPME ' 

PSGSDEDFQDLLHFGRDRADSTQCRLSRRHSTDSRPVSVTYRFG 

SVGQDTQLCIiWDLTEDILFPHQPLSRARTHTNVMNATSPPAGSN 

GNS VTTPGNS VPP PLPRSNSLPHSAVSNAGSKS S VMDGAIASGV 

SKFATLSLHDRKERHHEKDHKRNHSMGHISSKSSDKLNLVTKTK 

TDPAKTLGTPLCPRMEDVPLLEPOCKKIAHERLTVLIFIiEDCX 

VTACQEGFICTWGRPGKWS FNP 


6812


400X 


1682 


EDAVFSLDLSTIIQGTWFLNGEELKSNEPEGQVEPGALRYRXEQ 
KGLQHRLILHAVKHQDSGALVGFSCPGVQDSAALTIQESPVHIIj 
S PQDPCVSLTFTTS ER WLTCELSRVDFPATWYKDGQKVEES ELL 
WKMDGRKHRLlIiPEAKVQDSGEFECRTEGVSAFFGVTVQDPPV 
HIVDPREHVFVHAITSECVMLACEV\DR\EDAPVRWYKDGQEVE 
ESDFWLENEGPHRRLVLPATQPSDGGEFQCVAGDECAYPTVTI 
TDVSSWIVYPSGKVYV7VAVRLERWLTCELCRPWAEVRWTKDGE 
EWESPALtiLQKEDTVRRLVLPAVQLEDSGEYLCEIDDESASFT 
VTVTEPPVRIIYPRDEVTLIAVrrjECWIiMCEIiSREDAPVRWYK 
DGLEVE ESEAI,VI,ERDGPRCRIiVI*PAAQPEDGGE FVCDAGDDSA 
FFTVTVTEPPVQFIALETTPSPLCVAPGEPVVLSCEIjSRAGAPV I 
VWSHMGRPVQEGEGLELHAEGPRRVIiClQAAGPAHAGLYTCQSG 
AAPGAPSLSFTVQVAEPPVRWAPEAAQTRVIiSTPGGDLELWH 
LSGPGGPVRWYKDGERIiASQGRVQLEQAGARQVLRVQGARSGDA 
GEYLCDAPQDSRIFLVSVEEPLLViaiVSDLTPLTVHEGDDATFR 
CBVS PPDADVTWLRNCAVVTPGPQRQSCCS YGGCRMCGQRKART 
CVSKWRQAEWVQRGPCAGCEVGSPCPTTIACPWPRMGTSTASSS 
MVSYWPTRAPTAARATTIAPWPGSA 


6813 




836 


SSTQQRPGVPAGPRPLDGYLGVADHKPbKMHCRDCALVTSSGHL 
LHSRQGSQXDQTECVIRMNDAPTRGYGRDVGNRTSLRVIAHSSI 
QRILRNRHDLLhn/SQGTVFIFTtfGPSSYMRRDGKGQVYNNLHIJliS 
QVLPRLKAFMITRHKMLQFDELFKQETGQNNRKXSNTWLSTGWF 
TMTlALEIiCDR I NVYGMGP PDFCRDPNHPS VP YHYYEPFGPDBC 
TMYLSHERGRKGSHHRFITEKRVFKNWARTFNIHFFQPDWKPES 
lAINHPENKPVF 


6814 


3 


737 


KFRROEAN/ARERNRMHGLNDALDNLRKWPCYS KTUKLS KI ET 
LRUAKNYIWALSEILRIGKRPDLLTPVQNLCKGLSQPTTWLVAG 
CLQLNARS FLMGQGGEAAHHTRSP YSTFYP P YHS PELTTPPGHG 
TLDKSKSMKPYNYCSAYESFYESTSPECASPQFEGPLSPPPINY 
NGIFSLKQEETLDYGKNYNYGMHYCAVPPRGPUSQGAMFRIiPTD 
SHFPYDUILRSQSIiTMQDEI^NAVFHN 


6815 


906 


553 


QGLDPASQTKWELiLKDGSGRRGDRRSSRDMAGGAGPRSESDIjE 
DVGPTAEWNGDGSGSLRRSGSFGKLRDArjRRSSEMI.VKKLQGGT 
PQEPPNPRMKRASSIiNFLNKSVEEPTQPGG 


6816


1 


803 


NLLKTHKF\IaLGQDEDSLHSVPVAQMGNYQEYLKTI*ASPIiREID 

PDQPKRIiHTFGNPFKQDKKGMMIDEADEFVAGPQNKVKRPGEPN 

SPMSS KRRRSMS I^LLRKPQTPPTVTNHVGGKGPPSAS WFPSYPN 

LIKPTIiVHTDATI IHDGHEEKMENGQITPDGPLSKSAPSELINM ^ 

TGDIJ^PPNQ\7DSLSDDFTSL,SKDGLIQKPGSNAFVGGAKNCSLS 

VDDQKDPVASTLGAMPNTLQITPAMAQGINADIKHQLMKEVRKF 

GRSK 


6817 


172 


3457


LGMMDSPKIGNGIiPVIGPGTDIGISSLHMVCYLGKNFDSAKVPS 
DEYCPACKEKGKLKALKTYRISFQES I FLCEDLQCI YPLGSKSL 
NNLISPDLEECHTPHKPQKRKSLESSYKDSIiLIiANSKiCrRNYIA 
IDGGKVLNSKHNGEVYDETSSMliPDSSGQQNPIRTADSLERNEI 
LEADTVDMATTKDPATVDVSGTGRPSPQNEGCTSKLEMPIiESKC 
TSFPQALCVQWKNAYALCVnjDCII^ALVHSEEIjKOTVTGLCSKE 
ES I FWRLLTKYNQANTLLYTSQIjSGVKDGDCKICLTSEI faeiet 

clnevrde i fis lqpqlrgtligdmes pvpafplllkleth lekl 
flysfswdfecsqcghqyqnrhmkslvtptnvipewhpujaahp 

GPCNNCN b KS Q X RKM VIjB KVS P 1 tTib^ 

EGCLYQXTSVIQYRANNHFITWILDADGSWLBCDDIjKaPCSERH 
KKFEVPASE IH I VI WERKISQVTDKEAACLPIiKKTNDQHALSNE 

kpvsittscsvgdaasaetasvthpkdisvaprtlsqdxavthgd 
hllsgpkglvdnilpltleetiqktasvsqiinseafl\i»enkpv 
aentgilktntllsqesrmassvsapcmekllqdqfvdisfpsq 
wntwmqsvqlntedtvntksvnntdatgiiiqgvksvelekdaq 
lkqfltpkteqlkpervtsqvsnlkkkbttadsqtttskslqnq 
slkekqkkpfvgswvkglisrgasfmplcvsahnrntitdlqps 
vkgvnnfggpktkginqkashvskkarksaskpppiskppagpp 
ssngtaahphahaasevleksgstscgaqlnhssygngissanh 
edlvegqihkijlbkxrkklkaekkkxaalmsspqsrtvrsenije 
qvpqdgspndcesiedrilmelpypidianesacttvpgvslyss 
qtheeiiiaeiilsptpvstelsengegdprylgmgdshipppvps 
efndvsqnllilrqdhnycsptkknpcevqpdsltnnacvrtlnl 
espmktdifdeffsssalnalandtldlphfdeylfeny 


6818 


2 


240 


RGFDKVIiWT/ LS GAVK\CVQFSRIS PDGEEGYPGEbKVWVT YTL 


6819 


1 




DGgE/LHS/ATTEHKP/VQATPVNLT\TILTS'tWQARl.PQX- 

C31PCTEMGNFDMANVl(Jh:iKFAIHYCFKTHSLEIClKACKNIAY 
c,r,rausjs.i,iv FX VKT XL li PDRSSQGKRKTGVQRNTVDPTFQETLK 
VQVAPAQLVTRQLQVSVWHLGTLARRVFLGEVIIPIATWDPEDS 
TTQSFRWHPUiAKADKYEDSVPQSNGELTVRAKLVLPSRPRKLQ 
EAQEGTI>QPSLHGQLCI*WZiGAKNLPVRPDGTtJSSPVKGCLTI,P 
CQQKLRLKSPVLRKQACPQWKHSFVFSGVTPAQLRQSSLELTWI 
DQALFGMNDRLLGGT XRLiGS KGDTAVGGDACSQSIMWQKVIiSS 
PNLWTDMTLVIiH < 


6820 


1014 


340 


GDMVYIVGHVPPGFFEKTQNKAWFREGFNEKYLIO/VRKHHRVIA 

GQFFGHHHTDSFRMLYDDAGVPXSAMPITPGVTPWKTTLPGWN 

GAWNPAlRVFEYDRATLSUa^maTrPWNLSCANAQGTPRWEIJBY 

QLTEAYGVPDASAHSMHTVLDRIAGDQSTLQRYYVYWSVSYSAG 

VCDEACSMQHVCAMRQVDIDAYTTCLYASGTTPVPQLPLriLMAL 
LGLCT 


6821 


1088 


518 


EFDIYR/EVGGEFVPVTRDDSSNGFPRTQHGPSPU'VritlQSPQN " 
RFCVLTLDPETLPAIATTLIDVLFYSMSTPKEAASSSPEPSSIT 
FFAFSLIEGYI \S IVMDAETQBCKFPSDLLLTSSSGELWRMVRIG 
GQPLGFDECGI VAQ I AG PLAAADlSAYYISTPNFDHAIiVPEDQl 
GS VI EVLQRRQEGIAS 


6822 


1088 


518 


EFDIYR/EVGGEFVPVTRDDSSNGFPRTQHGPSPrVHPIQSPQN 
RFCVLTliDPETLPAIATTL.IDVLFYSHSTPKEAASSSPEPSSIT 
FFAFSLIEGYI\SIVMDAETQKKFPSDLLLTSSSGELWRMVRIG 
GQPLGFDECGIVAQIAGPLAAADISAYYISTFNPDHALVPEDGI 
GSVIEVLQRRQEGLAS 


6823 


654


221 


PPKLLSRWARMGHGDBIV\LSDIiNPPGrj:,HLPVVGPWRSVQTAC 
vaj.irvfi4ueirtVLHUjijfijD rYvESPAAVMELVPSDKERGLQTPVWTE 
YES I liRRAGCVRArAKIERFEFYERAKKAFAWATGETAIiYGNIi 
ILRKGVIAIiNPLL 


6824 


858 


104 


I^LAQR WGWG \ CCFFSIAVS VKMNVIibPAPGIJ^FI*Z-LTQFGraG 
ALPKLGICAGLQVVLGLPFI.LENPSGYIjSRSFDLGRQPLPHWTV 
NWRFLPEALFLHRAFHLALLTAHI.TX,LLLFALCRWHRTGESILS 
LLRDPSKRKVPPQPLTPNOIVSTLFTSNFIGICPSRSLHYQpyv 
WYFHTLPYLLWAMPARWLTHLI»RLLVLGt.IELSWNTYPSTSCSS 
AALHICHAVILLQLWIXSPQPFPKSTQHSKKAH 


6825


3 


1173 


SSGEFGLQASDIMWTISDTGWILIIIiCSLMEPWALGACTFVHtL 
PKFDPr,Vir,KTLSSYPIKSMMGAPlVYRMl,LQQDLSSVKPPlU,Q 
NCLAGGESLLPETLEMWRAQTGLDIREFYGQTETGLTCMVSKTM 
KIKPGYMGTAASCYDVQIIDDKGNVLPPGTEGDIOrRVKPZRPI 
GIFSGYVDNPDKTAANIRGDFWLLGDRGIKDEDGYFQFMGRADD 
IINSSGYRIGPSEVENALMEHPAWETAVISSPDPVRGEWKAF 
VILAIiOFI^HDPEQLTKELQQHVKSVTAPYKYPRKIEFVXiNLPK 
TVTGKlQRA\KLRDKEWKMSGKAPCAVRHIiRDIKLDSPrtI»SLSP 
PFGPLALPiyCDGYGDSIjWEEHEYKFCIiALVrSTKI.YHVRC 


6826 


2304 


954

< 


LKTESFKPW/VNIAIAFHLLGERASPNSFWQPYIQTLPREYDTP" 

LYFEEDEVRYLQSTQAIHDVFSQYKNTARQYAYFYKVIQTHPHA 

NKLPLKDS FTYED YRWAVSS VMTRQNQI PTEDGSRVTIiALlPLW 

DMCNHTNGLITTGYNLEDDRGECVALQDFRAGEQIYIFYGTRSN 

AEFVIHSGFFFDNNSHDRVKIKLGVSKSDRLYAMKAEVLARAGI 

PTSSVFALHFTEPP I SAQLIAFLRVFCMTEEELKEHIiLGDSAlD 

RIFTLGNSEFPVSWDWEVKLWTFLEDRASUCiLKTYKTTIEKDKS 

VLKNHDLS VRAKMAX KliRLGEKEILEKAVKSAAVNREYYRQQME 

EKAPLPKYEESKLGLLESSVGDSRLPLVbRKLEEEAGVQDALNI 

REAISKAKATENGLVNGENSIPNGTRSENESr^QESKRAVEDAK 

3SSSDSTAGVKE 


6827 


1 


779 


SS VVE FGLS VLGGLFLLFVLENMLGLLRHRGLRPRCCRRKRRNL ■ 
ETRJNLDPENGSGMALQPLQAAPEPGAQGQREKNSQHPPALAPPG 
HQGHSHGHQGGTDITWMVLIiGDGIiHNLTDGLAIGAAPSDGFSSG 
LSTTLAVFCHELPHELGDFAMLLQSGIiSFRRLUiLSIjVSGALGL 
GGAVLGVGIiSLGPVPLTPWVFGVTAGVPLYVALVDMLPALFPSS 
GAPAYA\HVLU3Gr>GLLI^CLMIAITLLEERLLPVTTEG 


6828 


3 


1654 


KSQHG/WILQLMHSCKEOyVKDLKGNPGLHRAMLDIiDNGTRPSE 
LGKLSQTASLKRGSSFQSGRDDTWRYKTPHRVAFVEKLTKLVIiS 
QLPNFWKLWISYVNGSLFSETAEKSGQIERSKNVRQRQNDPKKM 
IQEVMHSLVKLTRGALIiPLSlRDGEAKQYGGWEVKCELSGQWIiA 
HAIQTTOLTHESLTALEIPlTOLLQTIQDLlLDLRVRCVMATriQH 
TAEEIKRLAEKEDWIVDNEGLTSLPCQPEQCIVCSLQSLKGVLE 
CKPGEASVFQQPKTQEEVCQLSINIMQVFIYCIiEQLSTKPDADI 
DTTHLSVDVSS PDIiFGS IHEDPSLTSEQRItlilVLSNCCYIiERHT 
FLNIAEHFEKHNPQGIEKITQVSMASLKEI.DQRLFENYIELKAD 
PIVGSLEPGI YAGY FDWKDCliPPTGVRNYLKEALVNI lAVHAEV 
FTI SKEtiVPRVLS KVIEAVSEELSRLMQCVSS FSKNGiALQARLE 

1 ICALRDTVAVYLTPESKSSFKQALEALPQLSSGADKKLLEEIiW 

: KFKSSMHLQLTCFQAASSTMMKT 


6829 


1 


782 


mrmeageaappagaggraaggwgkwvru^vggtvflttrqtlcr" 

eqksflsrlcqgeelqsdrdetgaylidrdptyfgpliinflrhg 

klvldkdmabegvij3iy^fywigpiiiriikdrmeekdytvtqvp 

PKHVYRVLQCQEEBIiTQMVSTMSDGWRFEQLVNIGSSYWYGSED 
QAEFLCWSKELHSTPNGLSSESSRiCTKSTEEQLEEQQQQEEEV 
EBVEVEQVQVEADAQEK/CCYKPEAPGCEAPDHLQGIX3VPI 


6830 


1 


939 


MEPGSVENI>SIVYRSRDFLWKiCHWDVRIDSKAWRETLTLQKQI*. 

RYRFPELADPDTCYGFRFCHQLDFSTSGALCVAliNKAAAGSAY?'^ 

CFKERRVTKAYLALLRGHIQESRVTISHAIGRNSTEGRAHTMCI 

EGSQGCENPKPSLTDLWLEIIGZ.YAaDPVSKVLLKPLTGRTHQL 

RV\HCSALGHP\A^aDLTYGEVSGREDRPFRMMI.HAFyiiRIPTDT 

ECVEVCTPDPFI,PSl4DACWSPHTLI*QSi:*DQr.VQALRATPDPDPE 

DRGPRPGSPSALLPGPGRPPPPPTKPPETEAQRGPCIjQWliSEWT 

LEPDS 


6831 


3 


1087 


SLFFGSSTPDKKVAEQEDI^TQPS PS VEKAVTVIDPEGTI PTNF 

NVAEKPADHSLSEVKLKTADEPRGTLVKSGDGQNVKEKSMILSN 

VEDLQQPKFISEVSREDYGKKEISGDSEEMKINSWTSAIXSENL 

EIQSYSIilGEKJbVMEEAKTIVPPHVTDSKRVQKPAIAPPSKWNI 

SIFKEEPRSDQKQKSLliSFDWDKVPQQPKSASSNFASKKITKE 

SEKPESIXLPVEESKGSLIDFSEDRIiKKEMQNPTSLKISEEETK 

uKov^if I c.i^iUJISJbEl>JR\S xTLr\AEKKVLAEKQNSV\APIj 

NEIGKTQITLGSRSTELKESKADAMPQHFYQNEDYNERPKIIVG 

SEKEKDEKKKK 


6832 


1809 


412 


MGSGLISGPPQDNSGEALKEPERAQEHSLPNFAGGQHFFEYLLV 
VSLKRKRSEDDYEP I ITYQFPKRENLLRGQQEEEERIiLKAI PliF 
CPPPGNEWASIiTEYPRETFSFVLTNVDGSRKIGYCRRIjLPAGPG 
PRLPKVYCIISCIGCFGXiFSKIL0EVEKRHQIS^5AVIyPFMQGL 
REAAFPAPGKTVTLKSFIPDSGTEPISIiTRPLBSHLEHVDFSSb 
LHCLSFEQILQIFASAVLERXIIFLAEGLSTI.SQCIHAAAALLY 
PFSWAHTYIPWPESIJaATVCCPTPFMVGVQMRFQQEVMDSPMS 
EVLLVNLCEGTFLMSVGDEKDXLPPKLQDDILDSLGQGINELKT 
AEQINEHVSGPFVQFFVKIVGHYASYIKREANGQGHFQERSFCK 
ALTSKTNRRFVKKFVKTQLFSLFIQEAEKSKNPPAGYFQQKILE 
YEEQKKQ/TETKGKNCEIRAWNKND 


6833 


1 


1129 


PLMTLSQCGGIPGHGHSHGGHGHGHGIiPKGPRVKSTRPGSSDIN 
VAPGEQGPDQEETNTLVANTSNSNGLKL0PADPENPRSGDTVEV 
QVNGNLVREPDHMELEEDRAGQUIMRGVTLHVLGDAIXJSVTVVV 



NALVFVFSWKGCSEGDFCVNPCFPDPCKAFVEriNSTHASVYEA 
GPCWVLYLDPTLCWMVCILLYTTYPLLKESALIUXJTVPKQID 
IRNXilKELRNVEGVEEVHELHVWQLAGSRIIATAHIKCEDPTSY 
ME VAKTI KDVFHNHG IHATTIQPEPASVGSKSS WPCELACRTQ 
CALKQCCGTLPQAPSGKDAEKTPAVSISCLEIiSNNLEKKPRRTK 
AENrPA\WlEIKN\IPNK\QPESSL 


6834 


78 


1151 


AGQI2RPAPIWRI.LWLPTPSVSRKAEPAHIPINR*GA*E*RGGLP 

LCGSSASAYGWH*RLTPWSPGGS*HM*SSKAPVTQAREVLVAGP 

CSKLVLSGARGIVGTTVQVLVEAQQPULLLFTGVWGLNLRAGEE 

SRAL*LIEEVTQVRDAHLGNAWGCAQCI>SQGQVGSALAKALLE 

AAAAVRDCKEVI*TVSGDKQOAEVSVRL*VRDVCVEEAGCVEFGQ 

AHGRPGLALAKGRGGTNEVEEQVQVDGVQKLVLSAHECHEIiVAG 

QQDGEDQAARTRLLQAGAHSVAHGRRQGQAPCRPHQEAGVSCHE 

LQQWGDAL*ARE*APQIIVLLLLEDVAQLRTGKKA*DI»WDVE 
OLLRQL 


6835


1


834 


GlPAADRNEASLELIKLDiSRTFPNLCIFCKiGGPYHDMIJimS" 

AYTCYRPDVGYVQGMSFIAAVLILNLDTADAFIAFSNLUJKPCQ 

MAFFRVDHGLMnTYFAAPEVFFEENI.PKLPAHFKKNNLTPDIYIi 

I UWI FTL YSKSLPLDLACRI WDVFCRDGEEFI/FRTALGIIiKIiFE 

DILTKMDFIHMAQFLTRLPEDLPAEELFASIATIQMQSRNKKMA 

OVLTALQKDSREMREGKSVPPTIiRLQREFALGTNQSPMPRPIiCC 

FRLTPGQPRRTDAL 


6836 


1 


850 


MaCGRPPPDVDGMITLKV\,DNLTYRTSPDSLRRVFEK'SGRVGbV 

YI PREEJHTKAPRGFAFVRFHDRRDAQDAEAAMDGAELDGREIiRV 

QVARYGRRDLPRSRQGRRHAAGPEAA/RYGRRSRSYGRRSRSPR 

RRHRSRSRGPSCSRSRSRSRYRGSRYSRSPYSRSPYSRSRYSRS 

PYSRSRYRESRYGGSHYSSSGYSNSRYSRYHSSRSHSKSGSSTS"" 

SRSASTSKSSSARRSKSSSVSRSRSRSRSSSMTRSPPRVSKRKS 

KSRSRSKRPPKSPEEEGQMSS 


6837


1 


1369 


TDGAAVAGNPGSDYFPGGTAP/GGPRTRRP\SGTSSSGSKA^GP 
PNPPAQGDGTSLSPKYTLESTSGNDGKPVSGGGGRGRGRRKRDS 
GEIVSPGTFPDKYSAAPDSPGAPGVSPGQQOASGAAVGGSSAGET 
RGAPTPHEKALTSPSWGKGAELLLGDQPDIjIGSLDGGAKSDSSS 
PNVGEFASDEVSTSYANEDEVSSSSDNPQALVKASRSPLVIGSP 

klpprgvgagehgpkapppalglgimsnststpdsygggggpgh 

PGTPGLEQVRTPTSSSGAPPPDEIHPIiEILQAQIQLQRQQFSIS 

edqplglkggkkgecavgas gaqkgds elgsccs eavks ams ti 
dldslmaehsaawympadkalvdsadddktlapwekakpqmpns 
keahdlpankasasqpgshlqci^svhctddvgdakarasvptwr 
slhsdisnrfgtfvaalt 


6838 


16 


499 


liTDTPPPKTHMIHHSlSDYKATbRCWALGFYPMEITLTWQQDEE ' 

oqtrdmelvetrpagdgtfqkmaavwpsgee/q/rymchvqhe 
glpepiitlrweqssqptlpivgivagiivllgawtgawsavwc 

RKKNSDRVSYSEAASSDHAQGSDVSLTACKV 


6839 


1 


1195


AAPAGGGPDPEALSAFPGRHLSGI»SWPQVKRLDAI*LSEPIPIH6 
RGNFPTIiSVQPRQIRAGGPQHPGGAGMHVHRVRWIGSAASHVI, 
HPESGLGYKDLDLVFRMDLRSEASFQIiTKAVVIiACIilJ^FLPAGV 
SRAKITPLTLKEAYVQKLVKVerDSDRWSIiISLSNKSGKNVEI*K 
FVDSVRRQFEFSIDSFQIILDSLIiLFGQCSSTPMSEAFHPTVTG 

eslygdftealehlrhrvi atrs pee irgggliikychllvrgfr 
prpstdvralqrymcsrffidfpdlveqrrtleryleahpggad 
aarryaclvtuirvvnestvclml^errqtijdl i aaiju^qalae 
qgpaataaiawrppgtdgwpatvnyyvtpvqpliahayptwi.p 
an 


6840 


4254


2061 


EU3GDFSVPDVPKSMAWCENS ICVGFKRDYYLIRVDGKGS IKEL 
FPTGKQLEPLVAPLAIX3KVAVGQDDLTV\n^EGICTQKCALNW 
TDIPVAMKHOPPYIIAVLPRVVEIRTFEPRLLVQSIELQRPRFr- 

EMKDDSDSEKQQQIHHIKNLYAFNLFCQKRFDESMQVFA^Ti 

PTHVMGLYPDLLPTDYRKQI^YPNPLPVLSOAELEKAHrlz^ 

LTQKRSQLVKKLNDSDHQSSTSPLMEGTPTIKSKKKIiLQIIDTT 

J'vP^i^J^^^f'^^^^^'^^^^^^^^^KKAHKYSELl I 

LYEKKGLHEKAIX3VLVDQSKKANSPLKGHERTVQYI^HLGTEWL 

HLIFSYSVWVURDFPEDGLKIFTEDLPEVESLPRDRVLGFLIEN 

FKGLAIPYLEHIIHWJEETGSRPHNCLIQLYCEKVQGLMKEYLL 

SFPAGKTPVPAGEEEGELGEYRQiaLMPLEISSYYDPGRLICDF 

PFDGLLEERALLLGRMGKHEQALFIYVHILKDTRMABEYCHKHY 

DRNKIX3NKDVYI>Sr^YLSPPSIHCIX3PIKLELl,EPKAHLQAA 

IGNSAFARYPNGWVHYFCS\KEVWPADT 


6841
6842




3206 


TPSTl-GTi<iJNrPTS]bi;^V^AAVTi'l.NKSt^PLGDYGVGSKNSKRA^ 

REKRDSRNMEVQVTQEMRNVSIGMGSSDEWSDVQDIIDSTPELD 

MCPETRUJRTGSSPTQGIVNKAFGINTDSLYHELSTAGSEVIGD 

VDEGADLI^EFSGMGKEVGNLLLENSQLIiBTKNAX^NVVKNDLlA 

KVDQLSGEQEVLRGELEAAKQAKVKLENRIKELEEELKRVKSEA 

IIARREPKEEAEDVS5YLCTESDKIPMAQRRRFTRVEMARVLME 

RNQYKERLMELQEAVRWTEMIRASREHPSVQEKKKSTIWQPFSR 

LFSSSSSPPPAKRPYPSGNIHYKSPTTAGFSQRRNHAMCPISAG 

SRPLEFFPDDDCTSSARREQKREQYRQVREHVRNDDGRLQACGW 

SLPAKYKQLSPWGGQEDTRMKNVPVPVYCRPLVEKDPTMKLWCA 

AGVNX,SGWRPNEDDAGNGVKPAPGRDPLTCDREGDGEPKSAHTS 

PEKKKAKELPEMDATSSRVWILTSTLTTSKWXIDANQPGTWD^' 

QFTVCNAHVLCIS^-V^DSDYPPGEMFLDSDVKPEDPGADGV 

LAGITLVGCATRCNVPRSNCSSRGDTPVLDKGOGEVATIANGKV 

NPSOSTEEATEATEVPDPGPSEPETATLRPGPLTEHVFTDPAPT 

PSSGPOPGSRWtTtPTrDncconTODntncirir^/^i-iTwi^* _ - ^ 
*rtj^v«rwc-\joiii>(Lri'iifiJi3SSTRFBPEPSGDPTGAGSSAAPTMWIjG 

AQNGWLYVHSAVANWiCKCmSIKLKDSVLSLVHVKGRVLVALAO 

GTLAI FHRGEDGQWDLSNVHLMDLGHPHHSIRCMAWYDRVWCG 

YKNKVHVIQPKTMQIEKSFDAHPRRESQVRQIAWIGDGVWVSIR 

ldstlrlyhakthqhlqdvdiepyvskmlgtgklgpsfvrital 

LVAGSRLWVGTGNGWlSXPLTETWlJmGQ\LLG\UyiNKTSP 

tsgeg\arpgg\iihvyg\ddssdraarsfipycsmaqaqlcfh 

GHRDAVKFFVS VPGNVLATLNGSVLDS PAFGPRP a a on cmrcf/^n 

io^rnvlvlsggegyidprigdgeddeteegagdmsqvkpvlska 


6843 


3 


926 


Kuwvi,j,/ViIliTDHQYLh;KTPL,CAILKQKAPQUyRXRAKLRSYKP 
RRLFQS VKLHCPKCHLLQEVPHEGDLDI XFODGATKTPDVKLON 
TSLYDSKIVnTKNQKGRKVAVHFVKNNGILPLSNECIiLLI^^ 
LSEICKLSNKFNSVIPVRSGHBDr^LLDLSAPFLlQGTVHHYGC 
KQWST*RSIQNLNSLVDKTSWIPSSVAEAU5XVPLQYVFVMTFT 
LDDGTGVLEAYLMDSDKFPQIPASEVLMDDDLQKSVDMIMDMPC 
PPGIKIDAYPWLECFXKSYNVTNGTDMQXCYOIFD'PTVaprnvT 


6844 


2 

244 


851



NHKi^VLSGAKRYiiCNSCGKSFAYTSSLIKHRRIHfGERPYECSE''' 
'fSf^^^^^^^^^^'^^^^^CVECGKSFRRSSSLLQHQR 

>RKSSLIIHLRVHTGERPYECSDCGKSFAEITSSr,lKHLRVHTGE 
lPYECIDCGKSFflHSSSFRRHQRVHTGMRPYK*SKFWKFSCPGF 

.LLCX3QRVHTGSRpYECDKWGIFFS*NASFFT*KSAPTEErVPFE 
3IECEKAPSPr»SLVTTXFT 






642

c 


.Hwi^v^i-iibKKTQTSMSLGTTREKTDRVKSTVVYLSPQELEDVFY 
?YDVKS5XYSFGIVLtfEXATGDIPFQGCNSSKIRKIiVAVKRQQE 


6845 






PLG£DCPSELREIli;KCi^PSVRPSVDElLm5TFSK*ClK'^ 


6846 


3 


1519


VAVRDBCYWUHWi^WOgDi.WMLLFILMCHP£TARARLEYRlRTI,D 

CjI^FELY YHTTQDLQL PRE AGG WD V\m A VA P PWf^GDxrrjM e. r.r> « V. 
KYHIJRGVMSPDEYHSGVNNSVYTWVLVQNSLRFAAALAQDLGLP 
IPSQWLAVADKrKVPFDVEQNFHPEFDGYEPGEWKQADWLLG 
^fJf!r^^^^°^^^^^^^^5P<5GPAMTWSMFAVGWMELKD 
AVRARGLLDRSFANMAEPFKVWTENADGSGAVNFLTGN5GGFLOA 
WFGCTGFRVTRAGVTPDPVCLSGISRVSVSOIFYQGMKLNFSF 
SEDSVTVEVTARAGPWAPHLEAELWPSQSRLSLLPGHKVSFPRS 
AGRIQMSPPKLPGSSSSEFPGRTFSDVRDPLQSPIrffVTLGSSSP 
TESLTVDPASE^SGTGASETSLGPSLWPRLHPPLIiGTIiIiACHPS 
PAARLSGKVHAAWPEFKAFCL ^i^^nf^^ 


6847 


213 


1258 


i.):ti.KTIK*I.NRI^HP*YENEia*TKLRm'IMEQYTRTEESARG 
I IFTKTRQSAYALSQWITENEKFAEVGVKAHHLIGAGHSSEFKP 
MTQNEQKEVISKPRTGKINTLLIATTVAEEGLDIKECNIVIRyGL 
VTNEIAMVQARGRARADESTYVLVAHSGSGVIEHETVNDFREKM 
l^^^^^^^^^^^^^^^^^^^QMQSIMEKKMKTKRNIAKHYK 
NNPSLlTFLCKNCSVIiACSGEDIHVrEKMKmTMM'Ptjwwr.Y v-rtr 

RENKTLQKKCADYQINGEirCKCGQAWGTMMVHKGIODL 
KFWVPKNNSTKKQYKKWVELPITFPNLDYSECCLPSDEr) 


6848 


1450


348 


SHCWN£,UKi.I:;MPLIDLALILYPPSYVPYXGHLSDDSLSRKYCLT ' 
WFEDALNGVL*RAEAIQPHCVllAGDRMEKFRQlCYWNKI*QTI*RQO 
PFAYGTLTVRSLLDTREHCLNEFKFPDPYSKVKQRENOVALRCF 
PGVVRSI^AIfiWEERQIJ^VKGLLAGNVFDWGAKAVSAVI.EqnP 

YFGFEEAKRKLQERPWLVDSYSEWLQRLKGPPHKCAHFADNSG 

IDIIU3VFPFyRELLLRGTEVXIACNSGPAI^VTHSESLIVAE 

RIAGMDPWHSALREERLLLVQTGSSSPCLDLSRLDKGLAALVR 

ERGADLWIEGMGRAVHTtnfHAALRCESliKLAVIKMAWLAERLG 
GRLFSVIFKYEVPAE 




6849 


19 


16 


AMWWN^-LDGIRNIVLSNPKKRKTLSIAMlJCSLQSbll^A^^ ' 

LKVI 1 1 SAEGP VFSSGHDLKELTEEQGRDYHABVFQTCSKVMMH 

IRKHPVPV3AMVNGLATAAGCQLVASCDIAVASDKSSPATPGVN 

VGLPCSTPGVALARAVPRKVALEMLFTGEPISAQEAtiLHGLLNK 

WPEABLQEETMRrARKIASLSRPWSrXSKATFYKQLPQDLGTA 

YYLTSQAMVDNliALRDGQEGITAFLQKRKPVWSHEPV*VEH 




6850 


70 


821 


bI^VDGSCLh:OGiJPAPRPQTDTSP*PVGNWATQQEDi:YHQSyEC 
VCVLFASVPDPKEFYSESWINHEGLECLRLLNEIIADPDELLSK 
PKFSGVEKIKTIGSTyMAATGLNAT<5Grinartnnai?oc:r'erTT /-**pm 

VEFAVALGSKLDVTNKHSFNWFRLRVGLNHGPWAGVIGAQKPQ 
YDIWGMTVNVASRMESTGVLGKIQVTEETAWALQSLQYTCYSRG 
VIKVKGKGQLCTYFLNTDLTRTGPPSATLG 




6851


2 


1235 



AKCLNHEWTFEiCLRQHISRNAQDKQELHLFMLSGVPDAVFDLTD 
LDVLKLELIPRAKI PAKISQMTNLQELHLCHCPAICVEQTAPSFl* 
RDHLRCLHVKFTDVAEIPAWVyLLKNLRELYLIGNLNSBNNKMr 
GI^SLRELRHLKILHVKSNLTKVPSNITDVAPHLTKLVIHNDGT 
KLLVLWSLKKMMNVAELELQNCELERIPHAIFSLSNLQELDIiKS 
MNIRTIEEIISFQHLKRLTCLKLWHNKIVTIPPSITHVKNLESI* 
SfFSNNKLESLPVAVFSLQKLRCLDVSYNNlSMIPIEICLLQNLQ 
^tiHITGNKVDILPK:QI.FKCIKLRTLNLGQbICITSLPEKVGQLSQ 
-TPQLELKGNCLDRLPAQLGQCRMLKKSGLWEDHLFDTLPLEVK 
2ALNQDINIPFANGI 






1765 


660



/SAQVSAREGENCLGWNLADSSQESYKSLEEAEDCYPPSLLTLD 
jRDLFNQVEQGPLLSCPKAGTDLSMGRAREVGWMAAGLMIGAGA 
nrcVYKLTIGRDDSEKLEEEGEEEWDDDQELDEEEPDtWFDFET 



SEQ
ID
NO:


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence



6852 



6853



6854



1148 



6855 



6856 



1913 



Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 



469 



585 



1148 



1617 



6858 



997 



669 



Amino acid segment containing signal peptide

(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)

MARPVTODGI>WTEPGAPGGT fc;UKPSGGGKANkAHt>IKQRPFPYE 
HKNTWSAQNCKNGSCVLDLSKCLFIQGKLLFAEPKDAGFPFSQD 
INSHLASLSMARNTSPTPDPTVREALCAPDNLWASIESQGQIKM 
riNEVCKET VSRCCNS FLQQAGLNLLXSMTVINNMLAKS ASDLK 

FPLISEGSGCAtCVQVLKPLKGLSEKPVLAGBLVGAQMLFSFMSL 
P2RNGNREILI.ETPAP 

RTRGEKTYANFlKHNDGKNlFyAARTPA TLFAVWFAMYIISGLT ' ' 
GFIGLNSIAVI,CNLVMGLALIFLCTWAyVKySGEFREIGTVIDQ 
I AETLWEQVLKPLGDNLMEENIRQS VTNS IKAGLTDQVSHHARL 
KTD 

GDSCAVCXEhYKPUbL VHILTCNHI FHKl <JVDL>WUUUHRTCPMC 
KCDlLKALGIEVDVEDGSVSr^VPVSNElFHSASSHEBDNRSBT 
ASSGYASVQGTYEPPLEEHVQSTNESLQLVNHEAJTSVAVDVIPH 
VDNPTFEEDEYPNQETAVREIKS 

HKSYIGTFDPGEDCVCAAIQWLQDNSASYFliNRKLVYEPSTQAK" 
PVKNTFLRMWIYSHHIYQQDIJ^KKILDVGKRLDVTGFCMTGKPG 
IICVEGFKEHCEEFWHTIRYPNWKHISCKHAESVETEGNGEDLR 
I*FHSFEELLLEAHGDYGLRNDYHMNI>GQFIiEFLKKHKSEHVFOT 
LFGIESK5SDS 

GRVGGRVGRICSP]:.SGANEyiASTDTI.K THEVLIiFTDQTDDLAK ' 
EEPTSLFQRDSETKGESGl,VIiEGDKEIHQlFEDI.DKKLALASRF 
YIPEGCIQRWAAEMWALDAtxHREGIVCRDLNPNNII,LNDRGHI 
QLTYFSRWSEVEDSCDSDAIERMYCAPEVGAITEETEACDWWSL 
GAVLFELLTGKTLVECHPAGINTHTTLNMPEWVSEEARSLIQQL 
LQFNPLERLGAGVAGVEDIKSHPFFT PVDWAELMR 

VTnT.V\7g\yr>iV g'PL^r\g'T vTrrr^n- nr t^.^^T^..^ : u ' 




6860 



1889 



6861 



1889 



1150 



"isIs" 



1515 



J. V x«iix uvRAWW VU15LQAYAQLVSLGNPDFIEVKGVTYCGESSA' 
SSLTMAHVPWHEEWQFVRELVDI.IPEYEIACEHEHSNCLI.IAH 
RKFKIGGEWWTWINYKRFQELrQEYEDSGGSKTFSAKDYMARTP 
HWALFGASERGFDPKDTRHQRKN KSKAISGC 
KGPEATAMVCVCSHPNCRQ KHXKPSHSAAQl-WCGSPTPASAPNH " 
KLMAMEQGKTLPSATEDAKEEGLEAQISRLAELIGRLESKALWF 
DLQQRLSDEDGTNMHLQLVRQEMAVCPEQLSEFLDSLRQYIiRGT 
TGVRNCFHITAVRLSDGFTPVlYEFWETEEAWKRHLQSPiCKAF 
RHVKVDTLSQPEAIiSRlLVPAAWCTVGRD 

RSRGIKDFENPPPLSSCGIFQSRIA GDALLDSGIRISSVFASPA 

LRCVQTAKLlLEELKLEKKrKIRVEPGIFEWTKWEAGKTTPTLM 

SLEELKEANFNrDTDYRPAFPLSALMPAESYQEYMDRCTASMVQ 

IVNTCPQDTGVILIVSHGSTLDSCTRPLLGLPPRECGDFAQLVR 

KIPSLGMCFCEENKEEGKWELVNPPVKTLTHGAMATUfNWRNWrS 
GN 

GETMFKKAKTKAKKKPRKRSDSSGG X WLSDI IQS PSSTGLLKSG 

fCTNSVESLPELLTSDSEGSYAGVGSPRDU3SPDFTTGFHSDKrE 

AKVKPYVNGTSPVYSREDLKPWEKSPILKISAPQPIPSNRIDTT 

SSASWVAGSPSPVSPPWDIJlTlMEIEESROKCGATPKSHLGiCT 

VSHGVKLSQKQRKMIALTTKENNSGMNSMErVLFTPSKAPKPVN 

AWASSLHSVSSKSFRDFLLEEKKSVTSHSSGDHVKKVSFKGIEN 

SQAPKIVRCSTHGTPGPEGNHISDLPLLDSPNPtlLSSSVTAPSM 

VAPVTFASIVEEELQQEAALIRSREKPLALIQIEEHAIQDLLVF 

YEAFGNPEEFVIVERTPQGPLAVPMWMKHGC 

UKDKKRQKKKUIFPKVATNIMRAWLFQHLTHPYPSEEQKKQLAQ 

DTGLTXLQVNNWFINARRIIVQPMIDQSNRAVSQGAAYSPEGQP 

MGS FVLPGQQHMG XRPAGPMSGMGMNMGMDGQWH YM 

DKDKKRQKKRGIFPK^ATNIMRAWIiFQHliTHPYPSEEQKKQLtAQ 

DTGLTILQVNNWFINARRIIVQPMIDQSNRAVSQGAAYSPEGQP
SEQ
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








MGS FVLDGQQHMGIRPAGPMSGMGMNMGMIXKIWHYM 


6862 


2 


471 


EE I DRE FHNKLKLKEDKLEKQEKP VNGEDKGDSGVDTQNS EGNA 
DEED PLC PNCYYDKTKS FFDNISCDDNRBRRPTWAEERRLNAET 
FG I PLRPNRGRGGYRGRGGLGFRGGRGRGGGRGGTFTAPRGFRG 
GFRGGRGGREFADFEYRKTTAFGP 


6863 


2216 


487 


PQEPALKSEFSQVASWTIPLPLPQPNTCKDNGPCKQVCSTVGGS 
AI CSCFPGYAIMADGVS CEDQDECLMG AHDCSRRQFCVNTLGSF 
VCVTIHTVLCMGY ILNAHRKCVDI NECVTDIJITCSRGEHCVNTL 
GSFHCYKALTCEPGYAIiKDGECEDVDECAMGTHTOQPGFLCQNT 
KGSFYCQARQRCMDGFLQDPEGNCVDINECTSliSEPCRPGFSCI 
KTVGS YTCQRNPL ICARGYHASDDGTKCVDVNECETGVHRCGEG 
QVCHNLPGSYRCDCKAGFQRDAFGRGCIDVNECWASPGRLCQHT 
CEMTLGSYRCSCASGFLIJU:UDGKRCEDVNECEAQRCSQECANIY 
GSYQafCRQGYQLAEDGHTCTDIDECy\QGtf«3ILCTFRCI*MVPGS 
YQCACPECKSYTMTANGRSCroVDECAIiGTHMCSEAETCHNIQGS 
FRCLRFECPPNYVQVSKTKCERTTCHDFLECQWSPARITHYQIiN 
FQTGLbVPAHI FRIGPAPAFTGDTI AliNI IKGNEEGY FGTRRLN 
AYTGVVYLQRAVLEPRDFALDVEMKIjWRQGSVTTPIAKMHIPFT 
TFAli 


6864 


2 


2933 


LADSSPSNLQI I IKELLSMHHQPDPALTKEFDYLPPVDSRSSSG 

FVGLRNGGATCYKNAVFQQLYMQPGIiPESLLSVDDDTDNPDDSV 

FYQVQSLFGHLMESKLQYYVPENFWKIFKMWNKEIiYVItEQQDAY 

EFFTSbXDQMDEYLICKMGRDOIFKNTFQGIYSDQKICKDCPHRY 

EREEAFMALNLGVTSCQSLEISLDQFVRGEVLEGSrJAYYCEKCK 

EKRITVKRTCIKSLPSVLVIHLMRPGFDWESGRSIKYDEQIRPP 

WMLNMEPYTVSGWARQDSSSEVGEWORSVDQGGGGSPRKKVALT 

ENYELVGVIVHSGQAHAGHYYSPIKDRRGCGKGKWYKPNDTVIEh 

EFDLNDETLEYECFGGEYRPKVYDQTNPYTDVRRRYWNAYMIiPY 

QRVSDQNSPVLPKKSRVSWRQEAEDLSLSAPSSPEISPQSSPR 

PHRPNNDRLS ILTKLVKKGEKKGLFVEKMPARI YQMVRDEWLKF 

MK^nRDVYSSDYFSFVLSlASLNATKLKHPYYPCMAKVSX/QIiAIQ 

FLFQTYl^RTKKKIiRVDTEEWlATIEALLSKSFDACQWLVEYPlS 

SEGRELlKIFriXjECWVREVRVAVATILEKTLDSAbFYQDKLKSt. 

HQLLEVLIALLDKDVPENCKNCAQYFFLFWrFVQKQGlRAGDttL 

LRHSALRHMI S FLLGASRQNNQ IRRWSS AQAHEFGNLHNTVALL 

VLHSDVSSQRNVAPGIFKQRPPISlAPSSPLLPLHEEVEAIiIiFM 

SEGKPYLLEVMFALREbTGSLlAALIEMVVYCCFCNEHFSFTMIiH 

FIKNQbETAPPHELKNTFQIjLHEir*VIEDPIQVERVKFVFETEN 

w>uiJi/>jj"innonri vi^^oxviui ji^v^ v tjv x iJtv^vAitirrunMAJii t, s xvuMwnn 

WSWAVQWIiQKKMSEHYWTLQSHVSNETSTGKTFQRTlSAQDTIA 

YATALLNBKEQSGSSNGSESSPANENGDKHLQQGSESPMMIGKl, 

RSDLDDVDP 


6865 


1820 


1242 


DPERWKHLSKVTPPGSSVSTTPVQWRLQSPQSQGSMMPSCNRS 
CSCSRGPSVEDGKWYGVRSYLHLFYEGYAVPPKLEGIGEGEFIiV 
LDQRAADYNQALGTCRLAGTAIiCVAAGVLIAIdiFWAMXGWLSQ 
DTKAEPLDPEADSHVEVFGDEPEQQLSPIFRNASGQSWFSPPAS 
PFGQSSVQTIQPKRDS 


6866 


1571 


495 


DCPRPRYTliYGLRATCMRDLDWAWINAVSAFKALEQDIiPVNIKF 
IIEGMEEAGSVALEEIiVEKEKDRFFSGVDYIVISDNLMISQRKP 
AITYGTRGNSYF^n;EVKCRDODFHSGTFGGILHEP^lAI)IJVAIlLG 
SIiVDSSGHIL.VPGIY0EVVPLTEEEINTYKAIHLDIiEEYRNSSR 
VEKFLFDTKEE ILMHIiWRYPSLS IHGIEGAFBEPGTKTVt PGRV 
IGK^SIRLVPHt^SAVEKQVTRiniEDVFSKRNSSmCmrVSMTI* 
GIJIPWIANIDDTQyiiAAKRAIRTVFGTEPDM IRDGSTI P I AKMF 
QEIVHKSWLI PI/3AVDDGEHSQNEKINRWNYIEGTKLFAAFPL 
EMAQr4H 



SEQ
ID
NO:


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


6867 


2833 


1704 


GTRIMSQPKQKEIAGFVRQKMLLDYSVYMGRCVPQESRSPQRSP 

LQSAESSPTAGKKLPEVPPSEEEEQEAWVKALLGRIFWDFLGEK 

YWSDLVSKKIQMKLSKIKIiPyFMNELTLTELDMGWAVPKII^AP 

KPYVDHQGLWIDLEMSYNGSFLMTLETKMNLTKLGICEPLVEALK 

VGEIGKEGCRPRAPCLADSDEESSSAGSSEEDDAPEPSGGDKQL 

LPGAEGyVGGHRTSKimFVDKITKSKYFQKATBTEFIKKKIEEi 

VSNTPLLLTVEVQECRGTIiAVNIPPPPTDRVWYGFRKPPHVELK 

ARPKLGEREVTLVHVTDWIEKKLEQEFQKVFVMPNMDDVTITIM 
HSAMDPRSTS CLLKDPPVEAADQP 


6868


1 


346 


RPTRPPTHPjaEIKNLILPYISDMNFyQDLCEPFyELFKTDKGFlj"' 
KATFESQMSVMRGQILNLTQAI4UX3KSPF^I,VQIPCVIVERSQG 
GSQGR I VHLSNS FTQTVNCRKPFFSS W 


6869 
6870 


3 


1619 


MYMERMDKRALISFWESVBHLKNAKKNEIPQIjVGEIYQNFFVES 
KEISVEKSLYKEIQQCLVGNKGIEVFYKIQEDVYETLKDRYYPS 
FiVSDIiYEKLLIKEEEKHASQMISNKDEMGPRPEAGEEAVDEXST 

nqineqasfavwklrelnekleykrqalnsiqnapkpdkkivsk 

LKDEIILIEKERTDLQLHMARTDWWCENX^GMWKASITSGEVTEE 

NGEQLPCYFVMVSLQEVGGVETKKWTVPKRLSEFHNLHRKLSEC 

VPSLKKDQLPSLSKLPFKSIDHTFMEKFENQLNKFLQNLLSDER 

LCQSEALYAPLSPSPDYLKVIDVQGKKNSPSLSSFIiERLPRDFF 

SHQEEETEEDSDLSDYGDDVDGRKDALAEPCFMIilGEIFELRGM 

FKWVRRTIilALVQVTFGRTINKQIRDTVSWIFSEQMLVYYINIF 

RDAFWPNGKLAPPTTIRSKEQSQETKQRAQQKLLENIPDMLQSL 

VGQQNARHGIIKlFNALQETRANKHLLYALMELLLlELCPEUiV 
HLDQLKAGQV 




1 


1566 


MAAWAATRWWQLI*LVLSAAGMGASGAPQPPNILXiLU4DDMawa ' 

DLG VYGEPSRETPNriDRMAAEGLLFPNFYSANPLCSPSRAALLT^ ' 

GRLPIRNGPYTTNAHARNAYTPQEIVGGIPDSBQLLPELIiKKAG 

YVSKIVGKWHLGHRPQFHPLKHGFDEWFGSPNCHFGPYDNKARP 

NIPVYRDWEMVGRYYEEFPINLKTGEANLTQIYLQEAriDFI KRQ 

ARHHPFFLYWAVDATHAPVYASKPFLGTSQRGRYGDAVREIDDS 

IGKILELLQDLHVADNTFVFFTSDNGAAIiISAPEQGGSNGPPIiC 

GKQTTFEGGMREPAIiAWWPGHVTAGQVSHQLGSIMDLFTTSLAL 

AGLTPPSDRAIEKSLNLLPTLI^RLMDRPI FYYRGDTI>1AATLG 

QHKAHFWTWTNSWENFROGIDPCPGQNVSGVTTHNIiEDHTKLPL 

IFHLGRDPGERFPLSFASAEYQSALSRITSWOQHQEALVPAQP 

QLNVCNWAVMNWAPPGCEKZiGKCLTPPESIPKKOUWSH 


6871 
6872 


209 


1126 


rmslnppiflkrseensskfvetkqsqttsiasedplqnlcLas 

QEVLQKAQQSGRS KCLKCGGSIiMFYCYTCYVPVENVPIEQI PLV 
KLPtKlDIIKHPNETDGKSTAlHAKLLAPEFVNIYTYPCIPEYE 
EKDHEVAI.rPPGPQSISIKDISPHLQKRIQNNVRGKNDDPDKPS 
FKRKRTEEQEFCDLNDSKCKGTTLKKI IFIDSTWNQTNKIFTDE 
RLQGLLQVEIjKTRKTCFWRHQKGKPDTFLSTIEAiyYFLVDYHT 
DILKEKYRGQYDNLLFFYSFMYQLIKNAKCSGPKETGKLTH 


6873


880 


459 


FGhLmVhS LI FMKGNCVREDliI FNFLFKIiGliDVRETNGLFGNT 

KKLlTEyFVRQKYLEYRRIPYTEPAEYEFIiWGPRAFXiETSKMLV 

LRFLAKLHKKDPQSWPFHYLEALAECEWEDrDEDEPDTGDSAHG 
PrSRPPPR 




1929 


955 


DEQAVLCSKDKTYDLKIADTSNMLbFIPGCKTPDQLKKEDSHOr" 

IIHTEIFGFSNNYWELRRRRPKLKKLKKLLMENPYEGPDSQKEK 

DSNSSKYTTEDLlJ)QIQASEEElMTQr*QVLWACKIGGYWRILEF 

DYEMKLLNHVTQLVDSESWSFGlCVPLNrrCLQELGPLEPEEMIEH 

CLKCYGKKYVDEGEVYFELDADKrCRAAARMTiLQNAVKFNLAEF 

QEVWQQSVPEGMVTSLDOLKGLALVDRHSRPEIIFLLKVDDLPE 

DNQERFNSLFSIiREKWrEEDIAPYIQDLCGEKQTIGAIiLTKYSH 

SSMQNGVKVYWSRRPIS 



SEQ
ID
NO:


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


6874 


1 


307


DSlADHVNSAAVNVEEGTKNLGKAAKYKLAALiPVAGALIGGMVG 
GP IGLtiAGFKVAGIAAAUSGGVLGFTGGKLTQRKKQKMMEECLTS 
SCPDLtPSQTDKKCS 


6875


1688 


349 


VIGTGERGNSASEKWEIMFWEELGDPFIIIHSISLIiNAEEHSIA 
TLLLRIEKEELDMKGSGrYVSIiEWVTISKKKQDNKKYEIIKRDI 
LRGKSVPHyAAIEPDGKGIjMlVSYKSLTFVQAGQDt,EENMDEDI 
SEKIKEPLYYWOQTEDDLTVTIRLPEDNTKEDIQIQFLPDHINI 
VLKDHQFLEGKLYSSIDHESSTWIIKESNSLEISLIKKNEGLTW 
PELVIGDKQGELIRDSAQCAAIAERLMHLTSEELNPNPDKEKPP 
CNAQELEECDrFFEESSSLCRFDGKTLKTTH\A^WLGSNQyi.PSV 
IVDPKEMPCFCLRHDVDALLWQPHSSKQDDMWEHIATFNALGYV 
QASKRDKKFFACAPNYSYAALCECLRRVFIYRQPAPMSTVttYNR 
KEGRQVGQVAKQQVASLETNDPILGFQATNERLFVLTTKNIjFltl 
KVNTEN 


6876 


41 


1285 


VGEMTIilWRHLtiRPLCLVTSAPRILEMHPFLSLGTSRTSVTKLS 
LHTKPRMPPCDFMPERYQVIFLWSGSEANKIJVMLM7VRAHSNNI 
DI I S FRGAYHGCSPYTLGLTNVGI YKMELPGGTGCQPTMCPDVF 
RGPWGGSHCRDSPVQTIRKCSCAPDCCQAKDOYIEQFKDTJCSTS 
VT^XAGFFAEPIQGVNGWQYPKGFLKEAFELVRARGGVCIAN 
EVQTGFGRXiGSHFWGFQTHDVLPDIVTMAKGIGKfGFPMAAVITT 
PEIAKSLAKCIiQHFNTFGGNPMACAIGSAVIiEVIKEENLQENSQ 
EVGTYMIiTjKFAKIiRDEFEIVGDVRGKGLMIGIEMVQDKISCRPL 
PREEVNQIHEDCKHMGLLVGRGSIFSQTFRIAPSMCITKPEVDP 
AVEVFRSALTQHMERRAK 


6877 


1 


778 


GTS PS PARAYAPPTERKRFYQNVS ITQGEGGFEINLDHRKLKTP 
QAKLFTVPSEAIAIAVATEWDSQQDTIKYYTrjHLTTLCNTSIjDfJ 
PTQRNKDQLIRAAVKFLDTDTICYRVEEPETLVEIiQRNEWDPI I " 
EWAEKRyGVEISSSTSIMGPSlPAKrREVIiVSHLASYNTWAUX3 
lEPVAAQLKSiWLTLGIiIDIiRLTVEQAVLIiSRIiEEEYQIQKWGN 
lEWAHDYEI^ELRARTAAGTLFIHLCSESTTVKHiCLLKE 


6878 


931 


263 


QTLQGDFKNRAEMI DFNIRI KNVTRSDAGKYRCBVSAPSEQGQN 
LEEDTVTLEVLVAPAVPSCEVPSSALSGTVVEIiRCQDKEGNPAP 
EYTWPKIXSIRLLENPRLGSQSTNSSYTMNTKTGTIiQFNTVSKbD 
TGE YS CEARNS VGYRRCPGKRMQVPDIiNISG I lAAWWAliVlS 

II 


6879 


3 


845 


IRVIGESDIWQEFLSESDENYNGVSDVEJbRVALPDGrrVTVRVK 
KKSTTlXiVYQAIAAKVGMDSTTVNYFALFfiVISHSFVRKLAPlsrE 
PPHKI^YIQNYTSAVPGTCLTrRKWLFTTEEEILLNDNDIiAVTYF 
PHQAVDD\^KGYIKAEEKSYQIX5KLYEQRKMVMYLNMLRTCEGy 
NEI I PPHCACDSRRKGm'^I T7VIS ITHFKLHACTEEGQLENQVIA 
FEWDEMQRWDTDEEGMAFCFEYARGBKKPRWVKIFTPYFNY^5HE 
CFERVFCELKWRKEEY 


6880


2110 


1437 


RKDNCTAKEWTFPEAKWWTTARVFSHIRLGMGHVLIIVQCFI SS 
MANIYNEKILKEGNQLTESIFIQHSKLYFFGIIiFNGLTLGLQRS 
NRDQl KNCGFFYGHRAFSVALIFVI'AFQGLSVAFIIjKFLDNMPH 
VLMAQVTTVIITTVSVLVFDFRPSLEFFLEAPSVIir»SIFlYNAS 
KPQVPEYAPRQERIRDLSGNLWERSSGDGEEIjERIjTKPKSDESD 
EDTF 


6881 


2638 


2244 


NDS KWEDI HVI TGAIiKMFFRELPEPI*FTFNHFNDFVNAIKQE PR 
QRVAAVKDLIRQbPKPNQDTMQILPRHLRRVIENGEKNRMTYQS 
lAIVFGPTLLKPEKETGNIAVHTVYQNQrVELILLELSSIFGR 


6882 


1 


850 


GI PEAQLWIYP VKS CKGVP VSEAECTAMGliRSGNLRDRFWIiVIN 
OBGNMVTARQEPRLVr/ISLTCDGDTIiTLSAAYTKDXiLriPIKTPT 
TNAVHKCRVHGLEIEGRDCGEATAQWITSFLKSQPYRLVHFEPH 
MRPRRPHQIADLFRPKDQIAYSDTSPFLlLSEASIiADIJJSRLEK
SEQ
ID
NO:


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








KVKATNFRPNIVISGC0VyAEDSWDEI.I,IGDVELICRVMACSRCl"" 
LTTVDPDTGVMSRKEPIiErLKSYROCDPSERKLYGKSPr.FGQyF 
VLEN PGT I KVGDPVYLLGQ 


6883 
6884 


2794 
2 


2256 
99 


NSKliKLNQNLKLFITLTYQVLSLHGWGPGIHU^KEGAFPVXQNR 

ALQLLYDLRyLNIVLTAKGDEVKSGRSKPDSRIBKVTDHLEALI 

DPFDLDVFTPHLNSNLHRLVQRTSVLFGLVTGTENQIjAPRSSTF 

NSQEPHNILPLASSQIRFGLLPLSMrSTRKAKSTRNIETKAQYD 
ANC 

EFERVTAEAVKPRETSEPRAAAQRFCEKFPFIj ■ ■ 


6885 


297 


1554


ETGQFWHVTDLHLDPTYHITDDHTKVCASSKGANASNPGPFGDV" 

LCDS PYQL I IiS AFDFI KNSGQEAS FMl WTGDS PPHVPVPELSTD 

TVINVITNMTTTIQSLFPNLQVFPAIiGNHDYWPQDQIiSWTSKV 

YNAVANLWKPWLDEEAISTLRKGGPYSQKVTTNPNLRI ISUTTN 

LYYGPNIMTLNKTDPANOrEWLESTLNNSQQNKEKVYllAHVPV 

GYIiPSSQNITAMREYYWEKLlDIFQICYSDVIAGQFYGHTHRDSI 

MVLSDKKGSPVWSLFVAPAVTPVKSVLEKQTNNPGIRLFQYDPR 

DyiCLLDMLQYYI.NLTEANLKGESIWiCLEYIl4TQrYDIEDLQPES 

IjYGLAKQFTILDS KQPI KYYNYFFVS YDSSVTCDKTCKAFQI CA 

IMNLDNISYADCLKQIiYIKHNY 


6886 


2 


1341 


qcggipgreggssrpleegtgsspacvrgaapgsedafyptrak 
qarvsqelkkaakrtvs is egpdtiigdgmrerketiialapepep 

LE KEACEKWKR ?FRS AS ATS IiTLSHCVDVVKGl*l*DFKKRRGHSI 

GGAPEQRYQl 1 PVCV/y^LPTRAQDVLDAHLSEVNAVRFGPNSS 

LIiATGGADRLIHLWNWGSRLEANQTI^EGAGGSITSVDFDPSGY 

QVI^TYNQAAQLWKVGEAQSKETLSGKKDKVTAAKFKLTRHQA 

VTGSRDRrVKEWDLGRA YCSRTINVLS YCNDWCGDHII ISGHN 

DQKIRFWDSRGPHCTQVIPVQGRVTSLSIiSHDQLHI,LSCSRDNT^- 

LKVIDI^RVSNIRQVFRADGFKCGSDWTKAVFSPDRSYAIAGSCD 

GALYIWDVDTGKLESRLQGPHCAAVNAVAWCYSGSHMVSVDQGR 

KWIiWQ 


6887 


1047 


116 


WTARPSQKPFWEAGAVPGDPLSTGCSQAQLGGGCPRGPWGPQHG 
GQQRAAGPTLPRGERGGPQQSGPGIiAAQTPPTSKQVAWRAFLTG 
TyRSQSPRSPAGPFRGGTGWWPSPAVCLCVAVGPQRLSSPGLVY 
NASGSEHCYDIYRLYHSCADPTGCGTGPDARAWDYQACTEINIjT 
FASNNVTDMFPDLPFTDELRQRYCLDTWGVWPRPDWIjLTSFWGG 
DLRAASNX I FSNGNLDPWAGGGIRRNLSASVI AVTIQGGAHHIiD 
LRASHPEDPASWEARKTiEATIIGEWVKAARREQQPALRGGPRL 
Sli 


6888 


1 


992 


FVAYVKKEIPHIWTHCLLNPHALVIKTLPTKXRDALFTWRVI 
KFIKGRAPNHRLFQAFFEEIGIEYSVLLPHTEMRWLSRGQILTH 
IFEMYEE INQPLHHKSSNLVDGFENKEFKIHIAYIiADLFKHLNE 
LSASMQRTGMNTVSAREKLSAFVRKFPFWQKRIEKRNFTNFPFL 
EBI I VSDNEGIFIAAEITLHLQQLSNFFHGYFS IGDI*NBASKWI 
LDPFIjFNIDFVDDSYLMKNDLAEIiRASGQILMEFETMKLEDFWC 
AQFTAFPNUVKTALEILMPFATTYIiCELGFSlTPTFQNKVPEAA 
LILSDDIRVArSKKVPSFLGHH 


6889


1 


1534 


LTLENQI KE EREQDNS ES PNGRTSPLVSQNNEQGSTLRDLLTTT 
AGKLRVGSTDAGIAFAPVYSMGAPSSKSGRTMPNILDDIIASW 
ENKIPPSKTSKINVKPELKEEPEESIISAVDENIJKLYSDIPHSW 
IGEKHILWLKDYKNSSNWKLFKEa^KQGQPAWSGVHKKMNISL 
WKAES ISLDFGDHQADLLNCKDS I ISNANVKEFWDGFEEVSKRQ 
KNKSGETWI^KLKDWPSGEDFKTMMPARYEDIiLKSLPLPEYCNP 
EGKFNIASHLPGFFVRPDLGPRLCSAYGWAAKDHDIGTTNtiHI 
EVSDWNILVyVGIAKGNGILSKAGIUCKFEEEDLDDIIiRKRriK 
DSS E I HGALrai YAGKDVDKIREPLQKISKEQGLEVIiPEHDPIR 
DQSWYVNKKLRQRLLEEYGVRTWTItlQFLGDAIVLPAGALHQVQ 






WO 01/53312



PCT/US00/34263



SEQ 
ID 
NO: 


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








NFHSCIQVTEDFVSPEHLVESFHIiTQELRLLKEEINYDDKI^Vk 
NILYHAVKEMVRALKIHEDEVDDMEEN 


6890


3 


667 


THACGMWI PLYLHRALWHKTAETCNSPPCGAKDSLI rCAlTCF " 
TG FLGVDTGAG ATRWCRLKTQRADPLVCAVGMLGSAI FICLI FV 
AAKSS I VGAYI Cr FVGETLLFSNWAITA0ILMWV1 PTRRATAV 
ALQSPTSHLLGDAGSPyLIGFlSDLIRQSTKDSPLWEPI.SLGYA 

LMLCPFWVLGGMFFLATArjFFVSDRARAEQQVNQLAMPPASVK 
V 


6891 


1980 


1262 


LR I HQELLS KELKLLRG ITIES 1 1 H IGLAAGKEcJfmQDASNVMQ" 

lllktqshlynmednnpevrqaaaygtgvhaqfggddvrslcse 
avpllvkvi krahsktkkkviatencisaigkilkpkpncvnvd 
evlphhlswlpuiedkeeaiqtlsflcdliesnhpvvigpnnsn 
LPKT isiiaegkinetinyedpcakrlanwrqvqtsedlwlec 

VSQLDDEQQEALQELLNFA 


6892


3 


876 


RSVAAASGPGAWGTDHYCLEbLRKRDYEGYLCSLtiLPAESRSSV 
FALRAFNVELAQVKDSVSEKTIGIiMRMQFWKKTVEDIYCDNPPH 
QPVAIELWKAVKRHNLTKRWLMKIVDEREKNLDDKAYRNIKELE 
NYAENTQSSLLYXTLE II^I KDLHADHAASHIGKAQGIVTCLRA 
TPYHGSRRKVFIiPMDiCMUIGVSQEDPUlRNQDKNVRDVIYDIA 
SQAHLHLKHARSFHKTVPVKAFPAFLQTVSLEDFIjKKIQRVDFD 
IFHPSLQQKNTLLPIiYIiYIQSWRKTY 


6893 


1 


842 


DGERKSMSVERTFSEINKAEEQYSLCQEL.CSEIiAQDLQKKRL.KG 

rtvtiklknvnfevktrastvsswstaeeifaiakellkteid 
tu^fphplrlrlmgvrissfpneedrkhqqrs I igflqagnqals 
atectlektdkdkfvkplemshkksffdkkrserkwshqdtfkc 
eavnkqsfqtsqpfqvlkkkmnenleisensddcqiltcpvcfr 
aqgcislealnkhvdecldgpsisbnfkmfscshvsatkvnkke'' 
nvpasslcekqdyeah 


6894


1742 


1463 


TTLCKPLVPREHQ F YETLPAEMRKFT P Q YKG KB QLLEGLPH WRG 

dvrdrghgrpwqpslepslpptlcfpslssfssswpsaqhltps 

VFNPW 


6895 


2379 


478 


VTYVELCDLASPTALLIMRTVLDLIVEDLQSTSEDKEQQYTSQT 
TRLLALXiYALASHKACKLAILHLINGTIKGDERYAErPQDLIiAL 

V'RSPCI'n^\7TPriOr^\7'B'V\rTCTT AOr j"*nr\i>TIVT tt •ne'er^vt/^f^-r ^t^r 
^ ri-o*-vjfi^cj V j.«.yy\^vxix V AoiJuyoijt^ijyiJiALiXjjPSSSEGSlSEL 

eqlsnslpnkelkts icdcblatlansessynclltcvrtmmfl 
aehdyglfhlksslrknssalhsllkrwstfskdtgelassfl 
efmrqilnsdtigccgddnglmevegahtsrtmsinaaelkqll 
qskeespeniifleleklvlehskdddnldslldswglkqmles 

SGDPLPLSDQDVEPVLSAPESLQMLFWNRTAYVLADVMDDQLKS 

mmftpfqaeeidtdlolvkvdlielsekccsdfdlhselersfl 
sepsspgrtkttkgfklgkhkhetfitssgkseyiepakrahw 

PPPRGRGRGGFGQG IRPHDIFRORKQNTSRPPSMHVDDFVAABS 

KEWPQDG I pppkrplkvsqkissrggfsgnrggrgafhsqnrf 
ftppaskgnysrregtrgsswsaqntprgnynesrggqsnfwrg 
plpplrplsstgyrpsprdrasrgrgglgpswasansgsggsrg 
kfvsggsgrgrhvrsftr 


6896 


1 


555 


GN I VIQKKKYNKQH 1 1 PLENVTI DS I KDEGDLRNGWLIKTPTKS 

favyaatateksewmwhinkcvtdllsksgktpsnehaavwvpd 
seatvcmrcqkakftpvnrrhhcrkcgfwcgpcsekrfllpsq 

S S KP VRICDFCYDLLSAGDMATCQPARSDS YSQSLKS PLNDMSD 
DDDDDDSSD 


6897 


3 


920 


gdglmhewnglmerpdwetaiqkplcslpagsgnalaaslnhy 

AGYEQVTNEDLLTNCTLLLCRRLLSPMNLLSLliTASGLRLFSVL 

slawgfiadvdlesekyrrlgemrftlgtflrlaalrtyrgrla 
ylpvgrvgsktpas pvwqqgpvdahlvpleepvpshwtwpde 

DFVLVLALLHSHIJGSEMFAAPMGRCAAGVimjPYVRAGV'SKAML
SEQ
ID
NO:


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


6898 






l.ULFI^EKGRHMEYECPYl.VyVPVVAFR]:>EPia)GKGVFAVnrr 


6899


9X9 


346 


UK1-VTAVA5LLKGRQGI YTENERRMGAVIKlkI-i^'KlWl.Vl r tom 
LSNirNESLLFYLEMQTDlNGGSLKPVR^^i^^IMGTT^^^ 

Ogfllsiafycwtgcsi^fosprkeiqweS^^p^^ 


6900 


120 
3 


827 


TFOTKPFVGGKLHRVTAEVKHNITNTWCRVQGEWNSVLEPTYS 
NGETKYVDLTKlAVTKKRVRPLEKQDPFESERLWra^DS^f 


6901 




4Sl 




6902 


1 
2 


201 




6903 




267 


U/^^i^FPPSQppR^pp^AAPSSHPHSDr■TPNP'^^&TT^^^i^^^^■.^^;t — 


6904 


1 
464 


149 


i^lNu^jiRUbPTGlHII.VIDQMVUNKUDK5CPLP<;TUKAt,--Q6rv-.C"- 
HXXLK ^'*"'^"*- VftAfc.SSDGI 


6905




2092 


VGNFFGSTQDAEMEEYKlGIKKAPIQTYVI^AmQETVKY 

DGCEIAENITYLGRKGIFTGSSGI.QIVYl,SGTESI^^3EPVPG^^^^ 

SPKDVSSLRMMLCTTSQPKGVDILLTSPWPKCVGNFGNSSGE^^^ 

AQHATRFIAIANVGNPEKKKYLYAFSIVPMKLMDAAELVKQPPD 
VTENPYRKSGQEASIGKQILAPVEESACQPPPDLNEKQGRKRSS 
rGRDSKSSPHPKOPRKPPOPPGPCMPClIsPEV;^^^^ 

RRFFKSRGKWCWFERNYKSHHLQLQVIPVPISCSTTDDIKDAP " 
ITOAQEOQIELLEIPEHSDIKQIAQPOAAYFYVELDTGEKLFHR 


6906 


1 


226 


,''^^r\°^^'^^^^^^^^''^V^^^^^J^FNWIWRYi^^ 
VAGLVQOVLYCDFFYLYXTKVLKGKfaSLPA 


6907 


3 

2


611 


^ii:!;!t?"''"^°^^*^'^^^^^^SIEPADRFICTKRIAGKIIPAI 
ATTTATVSGLVALEMIKVTGGYPFEAYKNWFLNlAiPTWF^P'r 
TEVRKTKIRNGISFTIWDRWTVHGKEDFTLLDFINAVKEKYOIE 
^^^^^^^^^P^^^KRLKLTMHKLVKPTTBKKYVDLTV 
SFAPDIDGDEDLPGPPVRYYFSHDTD 






2228


l^Kyv^vv.A/.UAFRFSSGE£sf^HLIMSRR5QRlH^YSQGDDDGS " 

J?^fr^^^^^^^^^^^^S^^^^^«GI5ANWGEDLRVRRRRGT 
JGSESSRASGLVGRKATEDFLGSSSGYSSEDDYVGYSD;mQQSS 
>SRLRSAV5RAGSLLWMVATSPGRLFRXa.yWWAGTTWYRSTAA 
LI.DVFVLTRRFSSLKTFLWFLLPLLLLTCL.TYGAWYFYPYGLO 

RRLEALAAEFSSNWQKEAMRLERLELROGAPGQGGGGGLSHKD 

nf^^. ^^^^^^^'^^^^^"Q^MTQESPQESSVICEI^ 

QliAGI^EIAAtxALKQSSVAEEVGLLPQOIOAVRDDVESQFPA 

ISQFIARGGGGRVGLLQKEEMQAQLRELESKILTHVAEMQGKS
SEQ
ID
NO:


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








AREAAASLSLTLQKEGVIGVTKEQVHHIVKQALQRYSEDRIGUi"" 

DYALESGGASVXSTRCSETYETKTALLSLPGIPLWYHSQSPRVl 

LQPDVHPGNCWAFQGPQGFAWRLSARIRPTAVTLEHVPKALSP 

NSTISSAPKDFAIFGFDEDLQQEGTLLGKFTYDQDGBPIQTFHF 

QAPTMATYQWELRILTNWGHPEYTCIYRFRVHGEPAH 


6908


3 


780 


QVPSAAWLMAVCGLGSRliGliGSRLGLQGCFCSAARLLyPRFQSRG 
PQGVEDGDRPQPSSKTPRrPKlYTKTGDKGFSSTFTGERRPKDD 
QVFEAVGTTDELSSAIGFALELVTEKGHTPAEELQKIQCTLQDV 
GSALATPCSSAREAHLKYTTFKAGPILELEQWIDKYTSQLPPLT 
APILPSGGKISSALHFCRAVCRRAERRWPLVQMGETDANVAKF 
LtJRLSDYLPTLARYAAMKEGNQEKIYKKNDPSAESEGL 


6909 


3 


409 


GRLLAVGTDIiYGQRSSAPEQELIiVQDATPVSNSLliPEKAFSDIP 
SPYLRGTIKMMQAVRQAFQDQDDRRTWDGRPLTMAATFDDCliYA 
LCWDTIKRSSQTGEWQNXAIMTEEPELSPAYLISEAMRRSRMS 
LYC 


6910 


1 


1068 


LVPVWIDSYYYGKLViAPliNIVLYNIFTPHGPDLYGTEPWYFY 
LlNGFIiNFNVAPALALLVtiPLTSLMEYLLQRPHVQNLGHPYWLT 
LAPMYIWFl 1 FPIQPHKEERFLFPVYPtilCIiCGAVALSALQHSF 
LYFQKCYHFVFQRYRLEHYTVTSNWlAIiGTVFLFGLLS FSRS VA 
liFRGYHGPIiDLYPEFYRIATDPTIHTVPEGRPVNVCVGKEWYRF 
PSSFIiliPDNWQUSFIPSEFRGQLPKPFAEGPLATRIVPTDMNDQ 
NliEEPSRYIDISKCHYLVDLDTMRETPREPKYSSNKEBWIStAY 
RPFLDASRSSKLLRAPYVPFLSDQYTVYVNYTIUCPRKAKQXRK 
KSGG 


6911 


1184 


966 


GEDABEMETGNVANXilSIFGSSFSQLLRKSPGGGREEEEGEBSQ 
PEAAEPGQICCDKPVLRDMNPWSTAIVAF 


6912 


1 


844 


AMKPVETHSFQMLFTILSTGSAI.KAQSYEDAYRCIKSSIl.U3sV 
i3*o»j lUiiiju^ wijtrtNt SLPVi KGEIQARNIjGMAVEAWNEEGKAVW 
GESGELVCrkP I PCQPTHFWNDENGNKYRKAYFSKFPGIMAHGD 
YCRINPKTGGIVMLGRSDGTLNPNGVRFGSSEIYKXVESFEEVE 
DSLCVPOYNKYREERVTT.PT.KMAQr'waTrntJnT vif D Tona TTiiumT 

SARHVPSLILETKGI PYTLKGKKVEVAVKQI lAGKAVEQGGAFS 
NPETLDLYRD I PELQG F 


6913


1643 


1558


KKSHEESHKEELSYGAQASLPLPCSDFR 


6914 


1251


615 


ELAAECKSAGYPGTLIPYRCDLSNEEDILSMFSAIRSQHSGVDI 
CINNAGLARPDTLLSGSTSGWKDMFWVNVLALS 1 CTREAYQSMK 
ERNVDDGHI IKl NSMSGHRVLPLS VTHFYS ATKYAVTALTEGLR 
QELREAQTHIRATCISPGWETQFAFKLHDKDPEKAAATYEQMK 
CLKPEDVAEAVIYVIiSTPAHlQIGDIQMRPTEQVT 


6915 


254 


652 


GRSLSFKTFLIWVLISIYQGGILMYGALVLFESEFVHWAISFT 
ALILTELLMVALTVRTWHWLtWVAEFIiSLGCYVSSLAFLNEYFD 
VAFITTVTFLWKVSAITWSCLPLYVLKYLRRKLSPPSYCKLAS 


6916 


254 


652 


GRSLSFKTFLIWVLISIYQGGILMYGALVLFESEFVHWAISPT 
ALILTELLMVAXiTVRTWHWLMWAEFLSLGCYVSSLAPLNEYFD 
VTlFITrVTFLWKVSAITWSCLPLYVLKYLRRKLS PPS YCKIiAS 


6917 


254 


652 


GRSLSFKTFLIWVTilSlYQGGILMYGALVLFESEFVHWAISFT 
AlilLTELLMVALTVRTWHWLMWAEFLSLGCYVSSlAFLKEYFD 
VAFITTVTFLWKVSAITWSCLPLYVX.KYLRRKLSPPSYCICLAS 


6918 


28 


921 


PEAGTRSWREPDPEDLRRFLLSAACRSPPQWXiPGGGGGQVSSCS 
DTDVpyiiLnAVKSEPGRFAERQAVRETWGSPAPGIRIjLFLLGSP 
VGEAGPDLDSLVAWESRRYSDLLTiWDFLDVPFNQTLKDLIiLIaAM 
LGRHCPTVSFVLRAQDDAFVHTPAlAiAHLRALPPASARSI.YLGE 

vftqamplrkpggpfyvpesffeggypayasgggyviagrlapw 
llraaarvapfpfedvytglciralglvpqahpgfltawpadrt 
adhcafrnlllvrplgpqasirlwkqlqdprlqc 



SEQ
ID
NO:


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


6919
6920 


850 
1418 


41 
591 


QGRKELSGSVFCPFIQQEPKEMLTbSEYHERVRSQGQQ'LQQLQA 
ELDKU1KEVSTVRAANSERVAKLVFQRLNEDFVRKPDYAI.SSVG 
ASIDLQKTSHDYADRWTAYFWWRFSFWNYARPPTVILEPHVFPG 
WCWAFEGDQGQWIQLPGRVQLSDITLQHPPPSVEHTGGANSAP 
RDFAVFFLLSFFTHOGLOVYBETFVST/^K'pnr'pmrpTfci?Trifn»tiT 

QNDPPAAFPKVKIQILSNWGHPRFTCLYRVRAHGVRTSEGAEGS 
AQGPH 

EAQGPSKVHLTLKKKK " ^ — 


6921 


2 


1711 


MNATRSEEQFHVINHAEQTLRKMENYLKEKQLCDVLLIAGHLRI 
PAHRLVLSAVSDYFAAMFTNDVLEAKQEEVRMEGVDPNAliNSLV 
vi.t\j.x\av usjur^tLU X J, t b 1-JjAAACLIiQLTQV I D VCSNFL I KQLHP 

SNCLGIRSFGDAQGCTELU^VAHKYTMEHFIEVIKNQEFLLI.PA 

NEISKLLCSDDINVPDEETIFHALMQWVGHDVQNRQGELGMLLS 

YIRLPr.LPPQLIADIiETSSMFTGDLECQKLLMEAMKYHLliPERR 

SMMQSPRTKPRKSTVGALYAVGGMDAMKGTTTIEKYDLRTNSWli 

HIGTMNGRRLQFGVAVIDNKLYWGGRDGLKTLNTVECFNPVGK 

IWTVMPPMSTHRHGLGVATLEGPMYAVGGKDGWSYIJ^TVERWDP 
EGR0WNYVASMSTPRftT\7(^Wa.T MMVT vii-r/^/-tTiTv^o*>rtT 

FDPHTNKWSLCAPMSKRRGGVQVATYNGFLYWGGHDAPASNHC 
SRL,SDCVERYDPKGDSWSTVAPI>SVPRDAVAVCPLGDKLYWGG 
YIJGHTYiNTVESYDAQRNEWKEEVPVWIGRAGACVVWiCLP 


6922 


1075 


369 


IiTPPAGIRHEVRDREREREREREREKFPIiDSTGSEI.KC}NIHSIT 

GLPP7VMQKVMYKGLAPEDKTLREIKVTSG7UCIMGGGSTINDVIA 

VNTPKDAAQQDAKAEENKKEPLCRQKQHRKVLDKGKPEDVMPSV 

KGAQERLPTVPLSGMYNKSGGKVRLTFKLEQDQLWIGTKERTEK 

LPMGSIKNWSEPlEGHEDYHMMAFQIiGE*TBASYYWVYWVPTOY * 

VDAIKDTVIiGKWQYF 


6923


2469 


1660 


LGLFCILPlDTLCAVIiERDTLSIRBSRLFGAWRWAEAECQRQQ 
LPVTFGNKQKVLGKALSLIRFPJLMTIEEFAAGPAQSGILSDREV 
VWLFLHFTVWPiCPRVEYIDRPRCCLRGKECCXJMRFQQVESRWGY 
SGTSDRIRFTVNRRISIVGFGLYGSIHGPTDYQVNIQIIEYEKK 
QTLGQNDTGFSCDGTANTFRVMFKEPIEILPNVCYTACATLKGP 
DSHYGTKGLKKWHETPAASKTVFFFFSSPGNHNGTSIEIXSQIP 
EIIPYT 


6924 


2210 


1235 


PEERVICFVEYYLTAFHEGRKGAIiAKKPYNPIIGETFHCSWEVP 
KDRVKPICRTASRSPASCHEHPMADDPSKSYKLRFVAEQVSHHPP 

iscfyceceekrlcvnthvwtkskfmgmsvgvsmigegvi.rij:>e 

HGEEYVFTIiPSAYARSILTIPWVELGGKVSlNCAKTGYSATVIF 
HTKPFYGGJCVHRVTAEVKHNPTNTIVCKAHGEWWGTt.EFTYNNG 
ETKVIDTTTLPVYPKKIRPLEKQGPMESRNLWREVTRYIjRLGDr 
urtAiiiUiQ<HljtEKQRVEERKRENLRTPWKPKyFIQEGDGSGIliQ 
SPLESTIjMGLEVQSFPV 


6925 
6926 


2 
1 


1653 
733 


RGGAAGAAMEPDSVIEIiKTIELMCSVPRSbWLGCANLVESMCAL 

sclqsmpsvrclqisngtssvivsrkrpsegmyqkekdlcikyf 

DQMSESDQVEFVEHLISRMCHyQHGHINSYLKPMLQRDFITALP 

eqgldhiaenilsyldarslcaaelvckbwqrvisegmlwkkli 

ERMVRTDPLWKGLSERRGWDQYLFKNRPTDGPPNSFYRSLYPKI 
IQDIETIESNWRCGRHNLQRIQCRSENSKGVyCLQYDDEKIISG 
IJM3NSIKIWDKTSLECXiKVLTGHTGSVLCLQYDERVlVTGSSDS 
rVR\mDVmX5EVLm'IiIHHNEAVLHLRFSNGLMVTCSKDRS lAV 
MDMAS ATDITLRRVliVGHRAAVNWDFDDKYIVSASGDRTl KVW 
STSTCEFVRTLNGHKRGIACLQYRDRLWSGSSDNTIRLMDIEC 
GACLRVLEGHEELVRCIRFDNKRIVSGAYDGKIKVWDLQAALDP 
RAPASfLCLRTLVEHSGRVFRLQFDEFQIISSSHDDTrLIWDFIi 
KVPPSAQNETRSPSRTYTYISR 

SGRVAMDGLGIiQFPEQGFPAGPPIiliPPHMGGHYRbCQSLGAPPL 








SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








DGYPLPTPDTSPLDGVDPDPAFFAAPMPGDCPAAGTYSYAQVSD 
YAGPPEPPAGPMHPRLGPEPAGPSIPGLIAPPSAI^HVYYGAMGS 
PGAGGGRGFQMQPQHQKQHQHQHHPPGPGQPTPPPBALPCRDGT 
DPSQPAELLGEVDRTEFEQYLHFVCKPEMGLPYQGHDSGVNr4PD 
SHGAI SS WS DASSAVY YCNYPDV 


6927 


2 


1484 


I>TLCGDIQLMl*AQNANWRAAHI.EEFHYQTKEDQEIUiSLHRESS 
CQGFAWATDLSTDLESQLSVSCKCYEAANKILQFRDLKSQMPEH 
YVQVIiKRMGWlRNEIGVFYMNQAAALQSERLVSKSVSAAEQQr^W 
KKSFSCFEKGlHNFESIEDATNTUOiLLCNTGRLMRlCAQAHCGA 
GDELICREFSPEEGLYYNKAIDYYLKAUlSLaTRDIHPAVWDSVN 
WELSTTYFTMATLQQDYAPLSRKAQEQIEKBVSEAMMKSLKYCD 
VDSVSARQPLCQYRAATIHHRLASMYHSCLRNQVGDEHLRKQHR 
VI*ADLHYSKAAKLFQLLKDAPCELI,RVQrjBRVAPAEFQMTSQNS 
KVGKiKTLSGAI>DrMVRTEHAFQLIQKELIEEFGQPKSGDAAAA 
ADASPSLNREEVMKLLSIFESRLSFLLliQSIKLI^STKKKTSNN 

lEDDTXLKTNKHIYSQLLRATANKTATMiERINVIVHLLGQIAA 
GSAASSNAVQ 


6928 


1086 


777 


EAXDLINNLLOVKMRKRYSVDKTI^HPWLQDYQTWLDiRELEClT" 

IGERYITHESDDLRWEKYAGEQGLQYPTHLINPSASHSDTPETE 

ETEMKALGERVSIL 


6929 


1749


607


RDQRGYRDDRSPAREPGDVSARTRSGGGGGRSATTAMPPPVPNG 

NLHQHDPODLRHNGNVWAGRPSCSRGPRRAIQKPQPAGGRRSG 

RGPAAGGLCLQPPDGGTCVPEEPPVPPMDWEALEKHLAGLQPRE 

QEVRKQGQARTNSTS AQKNERESlRQKXAIiGS FFDDGPGI YTS C 

SKSGKPSLSSRLQSGMNl.QrCFVNDSGSDKDSDADDSKTETSLD 

TPLSPMSKQSSSYSDRDTTEEESESJUDDMDFLTRQiCKLQAEAKM 

ArJ\MAKPMAKMC^^"VKKQNRKKSPVADLLpHMPHISECLMKRSL'" 

KPTDLRD^r7;^;<3QLQ^^:VNI5LHSQIESIlNEELVQLLLXRDEI,^TE 

QDAMLVDIEDZiTRHAESQQKHMTVEKMPAK 


6930 


131 


545 


FKDTANVFVS L FQMRNNFRH YFI E PS QLKliFYO VI TWI VTQVTAI 

SYTWPFVLLSIKPSLTFYSSWYYCLHXLGILVLLLLPVKKTQR 

RKNTHENIQLSQSKKFDEGENSLGQNSFSTTNNVCNQNQEIASR 
HSSLKQ 




6931 


2 


6S9 


FVERLPNRPACLLVASGAAEGVSAQSFLHCFTMASTAFNLQVAt 

LIPSCPGAIiTDliASSGSLARILQHFHSESKPICAVGHGVAALCC 
ATNEDRSWVFDSYSLTGPSVCELVRAPGFARLPLWEDFVICDSG 
ACFSASEPDAVHWLDRHLVTGQNASSTVPAVQNLLFLCGSRK 




6932 
6933 


2 


1131 


fvdspgqgeqaeeeeggiqmnsrmrahspaegasvessspgpkk'" 

SDMCEGCRSLAAGHPGYISHDKETSIKYVSHQHPSHPQLFSIVR 

qacvrslscevcpgregpifpgdeqhgfvpshtffikdslargf 

QRWSIITXMMDRiyi,INSWPFIi£/3KVRGIIDEK!GKAX.KVFEA 
EQFGCPQRAQRMNTAFTPFIiHQRNGNAARSLTSLTSDDHIiWACL 
HTS FAWLLKACGS RLTEKLLEGAPTEDTI.VQMEKLADLEEESES 
WDNfSEAEEEEKAPVLPESTEGRELTQGPAESSSLSGCGSWQPRK 
LPVPKSLRHMRQVGGRGTAttHELRRRANHGLCI*PTRI*ASGPSTL 
KTLQEVTDSLLGGWLMAQGVGGI I 




6934 


1431 


890 


SLNLHCTIiPPPPHQYPAGYPSDKEGKKPKGQSKKQPSGTTKRPI 
SDDDCPSASKVYKASDSAEAIEAFQLTPQQQHLIRBDCQNQKLW 
DEVLSHLVEGPNFLKKLEQSFMCVCCQELVYQPVTTECPHNVCK 
DCLORSFKAQVFSCPACRHDLGQMYIMIPNEILQTLLDLPPPGy 
SKGR 


i 


... 


3030 


2S88 


bRDHSQCGGIRRVAIiARVSSViOilSKAKIRTVKMTPIIVIiAFIV ' 
CWTPFFFVQMWSVWDANAPKEASAPI IVMLLASIiNSCCNPWIYM 
LFTGHLFHEIiVQRFLCCSASYIjKGRRLGETSASKKSNSSSFVLS 
HRSSSQRSCSQPSTA


6935 


886 


S43 


NSALYVAGGNDGTSCLNSVERYSPKAGAWESVT^MNIRRST^DL ' 
VAMDGWIiYAVGGNDGSSSLNS lEKyNPRTNKWVAASCMFTRRSS 
VGVAVLELLNFP PPSS PTLS VSSTSL 


6936 


1347 


567 


RSHRRQFIiSRAI^IiEFFGKSHPPPHRLPRKSUIVGLHYSHilPFXiT "' 

TCLHFIiRKRLQKGEVGLSVETSKPQVPVGGLSRKKVPQEPWATV 

MEKRLQEAQLYKEEGNQRYREGKYRDAVSRYHRAI»LQLRGLDPS 

LPSPLPNl/SPQGPALTPEQENIIiHTTQTDCYNNIJVACIiLQMBPV 

NYERVREYSQKVLERQPDNAKALYRAGVAFFHUSDYDOARHYLL 

AAVNRQPKDAKVRRYLQLTQSELSSYHRKEKQLYLGMFG 


6937 


1 


727 


AVEFRCC PGRDPACFARGWRLDRVYGTCFCDQACRFTGDCCPi>Y ' 

DRACPARPCFVGEWSPWSGCADQCKPTTRVRRRSVQQEPQNGGA 

PCPPLEERAGCLEYSTPQGQDCGHTYVPAFrTTSAFKKERTRQA 

TSPHWSTHTEDAGYCMEFKTESLTPHCALENRPLTRWMQYUREG 

YTVCVDCQ P PAMNS VSIiRCSGDGUSSDGNQTLHWQAIGNPRCQG 

TWKKVRRVDQCSCPAVHSFIFl 


6938 


3 


719 


NSRiCLEIxAERVDTDFMQLKKRRQSSEKENDSGTLDTVGAVWDH"" 

EGNVAAAVSSGGLAI.KHPGRVGQAALYGCGCWAENTGAHNPYST 

AVSTSGCGEHLVRTII*ARECSHALQAEDAHQAI*LETMQNKFISS 

PFI*ASEDGVI»GGVIVL,RSCRCSAEPDSSQNKQTLLVEFI.WSHTT 

ESMCVGYMSAQDGKAKTHISRLPPGAVAGQSVAIEGGVCRLGEP 

SELTLQAECEASQRHFRT 


6939 


3 


810 


KVTAPRRPQRYSSGKGSDNSSVLSGELPPAMGRTALFHHSGGSS 
GYESIiRRDSEATGSASSAPDSMSESGAASPGARTRSLKSPKKRA 
TG10Riaa.IPAPLPDTTALGRKPSLPGQWVDLPPPLAGSIiKEPF 

eikvyeiddverlqrprptpreaptqgracvstrlriaerrqqr 
lrevqakhkhlceelaetcjgrlmlepgrwleqpevdpelepesa^ 
eylaalerataaz.eqcvnlckahvmmvtcfdisvaasaaipgpq"'*^ 

EVDV 


6940 


11B8 


4S6 


GKMAAQP LRHRSRCATPPRGDFCGGTERAIDQAS FTTSME WDTQ 
WKGSSPIiGFAGUSAEEPAAGPQIiPSMLQPERCAVFQCAQCHAV 
riADSVHLAWDLSRSLGAVVFSRVTNNVVLEAPFLVGIEGSLKGS 
TYNLLFCGSCGrPVGFHIiYSTHAALAALRGHFCLSSDKMVCYLI* 
KTKAIVNASEMDIQNVPLSkKIAELXEKlVLTHNRLKSLMKIIiS 
EVTPDQSKPEN 


6941 


1 


713 


SLSRADSDPHGPHTCGHVLNVIIGSNVIAIAEAQRQAEAI»GyQA 
VVLSAAKQGDVKSMAQFYGriLAHVARTRtiTPSMAGASVEEDAQI» 
HEIiAAEIiQ I PDLQLEEAIiBTMAWGRGPVCIiliAGGEPTVQLQGSG 
RGGRNOELALRVGAEIiRRWPIiGPIDVIjFLSGGTDGODGP'rEAAG 
AVfVT^EIiASOA A ARGTjH T ATFTiAHMn*^ HT RPPrr /Vin aUT .T .TTT^^ 
MTGTKVMDTHLLFLRPR 


6942 


X 


246 


GDYVERYDPKTDTWTMGAPr*SMPTWAVGGCI.IiGDRLYAr>GGYDG 
QTyiiNTMESYDPQTNEWTQMASI.NIGRAaACWVIKQP 


6943 


1 


739 


PMATGDGAKTlAIHNnCALTADSIRITWKATLPASSFRLSWLRLG ' 
HS PAGGS ITETLVCX3DKTEYLLTALEPKPTYI I CMVTMETTNAY 
VADETPVCAKAETADSYGPTTTLNQEQNAGPMAStiPLAGIIGGA 
VJUiVFLFLVLGAI CWYVHQAGEXiLTRERAYKRGgRKKDDYI-IESG 
TKKDNS ILEIRGPGLQMLPINPYRAKEBYWHTI PPSKGSSLCK 
ATHTIG YGTTRGYRDGGI PDIDYSYT 


6944 


960


156 


VANILLNGVKYESELTGSSERAEQPLSVGRLCSTICNMPKALRT 
LCVNHFIXSWLSFEGMLLFYTDFMGEWFQGDPKAPHTSEAYQKY 
NSGVTMGCWGMCIYAFSAAFYSAII»EKIjEEFLSVRTLYFIAYr*A 
FGLGTGrATIiSRNLYWLSLCITYGILFSTLCTLPYSLLCDYYQ 
SKKPAGSS APGTR?^GMGVDISr*LS CQYFIAQ ILVSLVLGPLTSA 
VGSANGVMYFSSliVSFLGCLYSSLFVIYEIPPSDAADEEHRPLL 
LNV


6946 


2067 


179 


EUEDRGLPRTMGAAbGTGTklAPWPGRAC^aALPkW-rPTAPAOGC " 

HSKPGPARPVPLKKRGYDVTRNPHLNKGMAFTLEERLQLGIHGL 

IPPCPLSQDVQLLRlMRYYERQQSDLDKYIIUirLQDRNEKLFY 

RVLTSDVEKFMPIVYTPTVGIiACQHYGLTFRRPRGLPITIHDKG 

HLATMLNSWPEDNIKAVWTDGERILGLGDU5CYGMGIPVGKLA 

LYTACGGVNPQQCLPVLLDVGTKNfiELIiRDPLYIGLKHQRVHGK 

AYDPLLDBFMQAVTDKFGINCTilQPEDFANANAFRLLNKYRWKY 

CMFNDDIQGTASVAVAGIIiAALRITKNKLSNHVFGFQGAGEAAM 

G\1AHLLVMALE\KEGVPKA\EATRKIW\MVDF\KGLIVQGRDH 

LNHEKEMFAQD\HPE VNSLEEWRLVKPTAl IGVAAIAEAVFTE 

QILRDMASFHER P\ 1 1 FALSNPTS KAECTA\EKCYRVTEGPRGF 

FAS\GSPF*GVLIWEMGKTFIPGGRGNNA*RVPRGWQLGVHSPG 

GDPGHIP\DEIFLPDSRAKLPQEVSEQHI,SQGRLVP\PLST\1R 

NVFLRIAIKVFD*GyKHNX*V\SYYPEPKD\KEAFCKIPGSYTPD 

YDS FYT/VDSY I WAQGKAMNVQTV 


6947 


133 


2551 


i^ui:; 1 :>UlTVAPGDPCPGVAHbIAPSMASDTPESLMAX.CTOFC^R 
NIiDGTI^yr*LDK2:TLRIJlPDlFLPSElVrnpT.VMPvui?T imMv/^ 

NF\EPHE\SFFWPLFRt>PRKQPASRRIHL\RED\LVQD\QD\LE 
AIRKQDL\VEL\YL'm\CBKLSAKSLQTI*RSFSHTLGVP*AFFG 
C\TNILLr.RKENPGGL/CEDEYXFNPTCQVLVKDpTFEGPSRLR 
F\LKLGRMIDWVPVES\LLRPr^SIAALDI>SGIQTSDAA\FI.TO 
WKDSL\VSLVL\YNMDLSDDniR\VIVQLHKLRHLDlSRDRLSS 
YYKFKLTREVLSLFVQKLGNLMSLDISG\HMILENCSISKIGKR 
EAGQTSI\EPSK\SSIIPFRGFEGGPLQF\liGVF*GIFCGRLTH 
IPAYKVSGDKNEEQVr,NAIEAYTEHRPEITSRAINLr.FDlARIE 
RCKQLLRALKIiVITALKCHKYDRNlQVTGSAALFYLTNSEYRSE 
QS\nfCLRRQVIQVVl,NGMESYQEVTVOIiWCCLTI,CWFSXPEELEi^' 
QYRRVNELLI^ILNPTRQDESIQRIAVHLCNALVCQVDNDHKEA 
VGKMGFWT^^iKLTQKKLLDKrCDQVMEFSWXSALWNITDETPD 
NCEMPLNFNGWKLFl^CIJtJEFPEKQELHRN>fiLGLLGNVAEVKEIi 
RPQLMTSQFISVFSNLLESKADGIEVSYNACGVLSHIMFDGPEA 
WGVCEPQREEVEERMVTAAIQSWDINSRRNXNYRSFEPILRLLPQ 
GISPVSQHWATWALYNIiVSVYPDKYCPLLIKEGGMPLLRDlIKM 
ATARQETKEMARKVIEHCSNFKEENMDTSR 


6948 


2 


1682 


Ti>Vi;TIPRGbAiJARPQSRSWRCCPVWRRSPGRARGRGLKMZ.NVP 
SQSFPAPRSQQRVASGGRSKVPIJCQGRSLMDWIRLTKSGKDLTG 
LKGRLIEVTEEELKKHNKKDDCMICIRGPVYNVSPYMEYHPGGE 
DEI^RAAGSDGTELFDQVHRWVNYESMXiKECLVGRMAIKPAVLK 
DYREEEKKVLNGMLPKSQVTDTrAKEGPSYPSYDWFQTDSX.VTI 
/EHI Y* TEGYQFRLNNS *SSE* FLYSRNNY*Gt.r.T *5 vww /p * a 

MRFRKIFLCGL/CESVGKIEIVLQKKENTSWDFLGHPLKNHNSIi 
rPRKDTGLYYRKCQLISKEDVlHDTRLFCLMLPPSTHLQVPIGQ 
HVYLKXPITGTEIVKPYTPVSGSLLSEFKEPVLPNNKYIYFLIK 
lYPTGLFTPELDRLQIGDFVSVSSPEGNFKISKFQEliBDLPLLA 
AGTGFTPMVKIIiNYAtTDlPSIiRKVKLMPFNKTEDDriWRSQIiE 
KLAPKDKRLDVEFVLSAPlSEWNGKQGHISPAlUIiSEFLKRNIiDK 
SKVLVCICGPVPFTEQGVRLbHDLNFSKNEIHSFTA 


6949 


104 


58


PDGAHSFFPDEVFTCSSLCLSCGVGCKKSMNHGKEGVPHEAKSR'" 

CKYSHQYDNRVyTCKACYERGEEVSWPKTSASTOSPWMGLAKy 

AWSGYVIECPNCGWYRSRQYWFGNQDPVDTWRTEIVHVWPGT 

DGFLKDNNNAAQRLUDGMNPMAQSVSEIiSLGPTKAVTSWLTDQI 

^PAYWRPNSQII^CNKCATSFKDNDTKHHCRACGEGFCDSCSSK 

rRPVPERGWGPAPVRVCDKCYEAR/TRPVSCYRGTSGR^RRRRT 

3ETVE 




152 


4656


JbRI^CLSRPLTRPGDDSVGGSAimSGAGGVGGGGGGKIRfRRCH " 
3GPIKPYQQGRC!0HQGrLSRVTESVKNIVPGWLQRYFNKNEDVC








SCSTDTSEVPRWPENKEDHLVYADEESSNlTDGRITPEPAVSm' 
EEPSTTSTASTVyPDVIiTRVSLYRSHLNFSMLESPALHCQPSTS 
SAFPIGSSGFSLVKEIKDSTSQH0DDNISTTSGPSSRASDKDIT 
VSKNTSLPPLWSPEAERSHSLSQHTATSSKKPAFNLSAFGTLSP 
SLGNSSILKTSQLGDSPPVPGKTTYGGAAAAVRQSmRNTPYQA 
PVRRQMKAKQLSAQSYGVTSSTARRrLQSLEKMSSPLADAICRIP 
SIVSSPLNSPIiDRSGlDITDFQAKREKVDSQYPPVQRLMTPKPV 
SIATNRSVYFKPSr,TPSGEFRKTNQRIDKKCSTGYEKNMTPGQH 
REQRESGFSYPNFSLPAANGLSSGVGGGGGKMRRERHAFVASKP 
LEEEEr^GPVLPBaSLPITSSSLPTFNFSSPEITTSSPSPIMSS 
QALTNKVQMTSPSSTGSPMFKFSSPIVKSTEANVLPPSSIGPTP 
SVPVAKTAELSGSSSTLEP I ISSSAHHVTTVNSTNCKKTPPBDC 
EGPFRPAEILKEGSVLDILKSPGPASPKIDSVAAQPTATSPWY 
TRPAISSFSSSGlGFGESLKAGSSMQCDTCLt^NKVTDNKCXAC 
QAAKLS PRDTAKQTG lETPNKSGKTTLSASGTGFGDKFKPVlGT 
WDCn3TCLVQNKPEAIKCVACEXPKPGTCVKRALTLT\A^SESAET 
MTASSSSCTVTTGTLGFGDKFKRPIGSWECSVCCVSNNAEDNKC 
VSCMSEKPGSSVPTSSSSTVPVSLPSGGSIjGLEKPKKPEGIWDC 
BLCIiVQNKADSTKCrJ\CBSAKPGTKSGFKGET>TSSSSSNSAASS 
SFKFGVSSSSSGPSQTLTSTGNFKFG0QGGFKIGVSSDSGYINP 

msegf*fskhivgfkfgvsseskpeevkkoskndnfktolsfgl 
snpvfltpfqfgvsnixsqeekkeei^ksscagfrfgtgvinstr 

VPANTl VTSENKSS FNLGTI ETKSVSVAPUCCQTSEAKKEEMPA 
TKGGFSFGNVEPASLPSASVFVIiGRTEEKQQEPVTSTSLVFGEG 

kltmkepkcXqpvfsfgefqrqtkdensskstfsfsmtkpseke 

SEQPAKATFAFGAQTNTTADQGAAKPDIiSYLNNSSSSSSTPATa 

agggVxfgsstssswppvatfvfgqssnpgsssXafgntaesst'' 

SQSLLPSQPSPATTSSTGTAVTPFVFGPGASSNNTTTSGFGFG 

atttsssagssfvfgtgpsapsaspapgahqtptfgqsqgasqp 

KfPPGFGSISSSTAl^FPTGSQPAPPrFGTVSSSSQPPVFGQQPSQ 

safgsgttpnsssafqfgssttnfnftnnspsgvftfgansstp 
aasaqpsgsggppfnqspaaftvgsngknvfsssgtsfsgrkik 

TAVRRRK 


6950 


2585 


411 


prpgsrsglcrragergavragglsrrtrae* imdelhyqdtds ■ 

DVPEQRDSKCKVKWTHEEDEQLRALVRQFGQQDWKFLASHPPNR 
rOOQCQYRWLRVLNPDLVKGPWTKBBDQKVlEIiVKKYGTKQWTL 

iakhlkgrlgkqcrerwhnhlnpevkkscwteeedri iceahkv 
lgnrw;veiakmlpgrtdnavknhwnstikrkvdtggf1*sbskdc 
kppvyxiiileiiedkdglqsaqptegcjgslltnwpsvpptikeeen 

SEEEriAAATTSKEOEPTf?TriT.TiavpfrTJi?DT iri;;«t?DirDCTM^t:»/Nf'Ttn 
ETSLPYK>nn*rEyu^I,LIPAVGSSLSEArj3LIESDPDAMCDLSKF 
DLPEEPSAEDSINNSLVQLOASHQQQVLPPRQPSAXLVPSVTEY 
RliDGHTISDIiSRSSRGELIPISPSTEVGGSGIGTPPSVIiKRQRK 
RRVALSPVTENSTSLSFLDSCKSLTPKSTPVICrLPFSPSQFriNF 
PINKQDTLELESPSLTSTPVCSQKWVTTPLHRDKrPLHQKHAAF 
VTPDQKYSMDNTPHTPTPFKNALEKYGPLKPLPQTPHLEEDLKE 
VLRSEAGIELIIEDDIRPEKQKRKPGtRRSPIKKVRKSlALDIV 
DEDMKLMMSTLPKSLSLPTTAPSNSSSLTLSGIKEDNSLLNQGF 
I^AKPEKAAVAQKPRSHFTrPAPMSSAWKTVACGGTRDQLFMQE 
KARQLI»GRLKPSHTSRTLILS 


6951


1940 


239 


AGPDDTMKRSliQALYCQLLSFLLILALTEAiAFAIQEPSPRESL" 

QVLPSGTPPGTMVTAPHSSTRHTSWMLTPNPDGPPSQAAAPMA 

TPTPRAEGHPPT\TPSPPSLRQ*PPPILKAP/SSTGPAPAAMAT 

TSSKPEGRPRGQAAPTILbTKPPGATSRPTTAPPRTTTRRPPRP 

PGSSRKGAGNSSRPVPPAPGGHSRSKEGQRGRNPSSTPLGQFCRP 

LGKIFQIYKGWFTGSVEPEPSTLTPRTPLWGYSSSPQPOTVAAT



6952



6953



6B8



304



1512



6954



819



6955



1968



782



6956



8605



3839




SRPLSTSSGVFTAATGPTPMFDTSVSJVPQnaTT>i^;^!}5^ 



EVLNMESLPTVHNEGPSSAEGKDiAPSPPWPAG^VCN^SJ 

etpeehevrrmrdreakrlqrmqetde^rl^re^rSp 

MAA EMNFFQI.PVSGVELDSQI,I/3KM AFRRnM«.,"fg°^^ 

PgPPFXU'oUfKi:AUi'AU>KHaGDS bl^PPVKU.A.TRAAAQN - 

*PQR*RWTEGNSPQASAVATPGQGASPflAPRerp.psRRHR^p 

PGARPPAG.AAPAPTKPWLAGPASAPQPGftAPLSPPA^L?^ 

♦c^aaargrprrdrsprprtpggcsSse^ppSS 



»o.n.^'^*°°^°'*^=^°'^*P'^«'««5RSHRRBGTIPGHPHPR 



ERHGYCTLGEAFNRLDPSSAIQDIRHFNyWKLLQLIAKSOLTS 
GDTI-HFCRHCSIl.F»KDSaHPCTAADPDSCFTpi:sTOHFl^LFK 




^^„„™'^''°^^^^^^SRNVVHSVRREHFSFSPRMPVGDKF 

eerdtpegi^wvqi^aeeipsriqaitgkrgrprntStk^v 

5f:?2^''''^^^'^Q«QO"^LEE«KKPrEDMcSSpDS 



LCQGDSWEVQDLLVRliKAAloIDPGFPSYCQSLKILGEKVSEI 

^™^''°^^''""^'°™"«syrkhkwiv;gi^^Sct 
vlakrtgrsevemegpeeclgrrrssrimevtsgmeeeeeeesi 

AAVPGRRGRRDGEVDATASSIPEI.ERQIEKLSKROT.p™^Sf^

LQLQSHKGFLEQEGSPLSLGQSQHDI^QSAFr^WI^OTnaMoo? 



K.L.lVAMPEPTKXKEMBVPAPA P^PKKPSKEKEAgTTPAiaiW Ti 
ETI.PGEECAXQHANSQI,SILPIEKPQ(MWKvSEDlTPri5v 
EDLSEKPTINGSRKWMDIASKAaKHLQBKETFERHSRVYTFEN 

SAFKRSGEGQEDAGEUJPSGIiKRKEVKQQEEEPoSvWEZw 

VyQVDKGGRVRFWEUVDPKLEVKWNKNGQELRPSTKYIFEM 
CQSIIJTIDNCQMTDDSEYVVTAGDEKCSTELLTOEPM^KC 
EDTTDYa^RVELECEVSEDDAQVKWPKNGEEIILVOTRYllR 

egkkhilii^gatkadaadysvmttggqssaklsvdSp








LQPQTPGI^QSSHLSLLSSRDYRMLSSFWEWFWQDRFWLPPNVT' 
WTEIJEDRDGRVYPHPQDLLAALPIJVLVLLAMRLAFERFIGLPLS 
RWLGVRriOTRRQVKPNAlXEKHFLTEGHRPKEPQLSi.IAAQCGL 
TLQQTQRV7FRRRRNQDRPQLTKKFCEASWRFLPYLSSFVGGLSV 
LYHESWLVJAPVMCWDRYPNQLTLSCPAABSEAXSLYWWniLEIiQ 
FYLSLLIRLPFDVKRKGGGPSSIKPRPHYDPPSTA\DFKEQVIH 
HF VAVI LMTFS YSANLIiRIGSI>VIjIiIjH0S S D YLLEACJQ'I VNYMQ 
YQQVCDTajFIiIFSFVFPYTRLVLFPXQILYTTYYESISNRGPFF 
GYYFFNGLLMLLQLLHVFWSCLILRMIjYSFMKKGQMEKDIRSDV 
EESDSSEEAAAAQEPI*QLKKGTAGGPRPAPTDGPRSRVAGRLTN 
RHTTAT 


6960 


387 


2068 


akwarekemqef\trsff\rgrpdlstlthsivrrryiahsgrs 

HLEPEEKQALKRLVEEEPLKMQVDEAASREDKJCjDLTKKQKRPPT 

pcsdperkrfrpnsesbsgseasspdyfgppakngvasrshthp 
keewprra\skaveessdeerqrdlpaqrgeesseeeekgykgk 
trkkpwkkqapgkasvsrkqareeseeseaepvqrtakkvegn, 

PRTRSNGRRKSAREERSCKQKSQAKRLIiGDSDSEEEQKEAASSG 
DUSGRDREPPVQRKSEDRTQLKGGKRLSGSSEDEEDSGKGEPTA 
KGSRKMARLGSTSGEESDLEREVSDSEAGGGPQGERKNRSSKKS 
SRKGRTRSSSSSSDGSPEAKGGKAGSGRRGEDHPAVMRLKRYIR 
ACGAlIRNYKKI.I/3SCCSHKERI.SII»RAEIiEALGMKGTPSLGKCR 
ALKEQREEAAEVASLDVAWIISGSGRPRRRTAWNPLGEAAPPGE 
LYRRTUDSDEERPRPAPPDWSHMRGIISSDGESN 


6961


340 


1646 


RPWSSPTMKPNFSLRLRIFNLNCWGIPyiiSKHRADRMRRUSDFL 
NQESPDrj:a*LEEVWSEQDFQyLRQKLSPTYPAAHHFRSGlIGSG 
LCVFSKHPIQELTQHIYTLNGYPYMIHHGDWFSGKAVGLLVIiH^^.-, 

sgmvlnayvthi^eynrqkdiyiahrvaqaweiaqfihhtskk 
adwllcgdlnmhpedlgccllkewtglhdayletrdfkgseeg 
ntmvpkncyvs qqelkp fpfgvrid wlykavsg fyiscks fet 
ttgfdphrgtpi*sdhealhatlfvrhsppoqnpssthgp\aers 

PL/MCV-CLKEAIjIXSSXiGLGMANQARWWAXTFAXsyVIGLGLVrj:. 
LALLCVIiAAGGGAGEAAI liLWTPS VQLVLWAGAFYLFHVQEVNG 
tiYRAQAELQHVLGRAREAODLGPEPQLYALLXLGQQEGDRTKEQ 


6962 


340 


1646 


RPWSSPTMKPNFSLRLRIFNIiNCWGIPYLSKHRADRMRRLGDFL 
NQES PDIiALIiEEVWSEQDPQYIiRQKLSPTYPAAHHPRSGI IGSG 
LCVFS KHP IQELTQHIYTLNGYPYMIHHGDWFSGKAVGLIiVXiHL 
SGMVLNAYVTHLHAEYKRQKDXYLAHRVAQAWELAQFIHHTSKK 
ADWLLCGDLNMHPEDIiGCCOliIiKHWTGIiHDAYLETRDFKGSEEG 
NTMVPKKCYVSQQELKPFPFGVRIDYVLYKAVSGPYISCKSFET 
TTGFDPHRGTPLSDHEAIiMATLFVRHSPPQQNP£STHGP\AERS 
PL/MCVCLKEALI)GSLGLGMA\QARWWA\TPA\SyVIGLGL\I*t, 
LAI,LCVIjAAGGGAGEAAILLWTPSVGLVLWAGAFyijFHVQEVNG 
LYRAQAELQHVIX5RAREAQDLGPEPQI.YAIi]:*\IiGQQEGDRTKEQ 


6963 


374


2618 


RVJPLUjKLLKKPKTAENQKASEENEITQPGGSSAKPGLPCLNF 
EAVLSPDPTILIHSTHSLTNSHAHTGSSDCDISCKGMTERIHSIN 
I^FSNSVLETI^NEQRNRGHFCDVWRIHGSMLRAQRCVIiAAGS 
PFFQDKXJuLGYSDIEIPSWSVQSVQKLIDFMYSGVLRVSQSEA 
LQILTAASILQIKTVIDECTRIVSQNVGDVFPGIQDSGQDTPRG 
TPESGTSGQSSDTESGYLQSHPQHSVDRIYSALYACSMQNGSGE 
RSFYSGAWSHHETALGLPRDHHMEDPSWITRIHERSQQMERYI. 

sttpetthcrkqprpvriqtlvgnihikqemeddydyygqqrvq 
iiierkeseectedtdqaegtesepkgesfdsgvsssigtepdsv 
eqqfgpgaardsqaeptqpeqaaeapaeggpqtnqletgasspe 

RSNEVEMDSrVITVSNSSDKSVLOQPSVNTSIGQPLPSTQLVLR 

qtetltsnlrmpltltsntqvxgtagntylpalfttqpagsgpk 











PFLFSLPQPLAGQQTQFVTVSQPGLSTFTAQLPAPQPLASSAGH 
STASGQGEKKPYECTLCNKTFTAKQNYVKHMFVHTGEKPHQCSI 
CWRSFSLKDYLIKVHrWTHTGVRAYQCSICNKRFTQKSSLNVHM 
pi.unK.\jrc.p^ I iiv, I Ai-ft.iU\.t oHitx jjJLtRHVALHSASNGTPPAGTPP 
GARAGPPGWACTEGTTYVCSVCPAKFDQIEQFNDHMRMHVSDCS 


6964 


1 


178


SGRPFFFFFSNTDVYFXKKVTNRWTAGSSVKMTRMKSiGKXLLL 
QIFIGXKCSMFVIiVI 


6965


7S7 


208 


NVFI EPRIQGFMKTSAHPGQKHPDFSMGHjFPLLAALEVCSCGS 
SGSLGYNLPQNHXGLLGRNTLVLLGQMRRISPFLCLKDliSDFRF 

pqeicvevsqlqka\qamsflydvlqqvfnfshkall\ccmehdl 

PGPTPHFTSSAAGTPGDLLGAGDGRRRSWGQWVIEGSTIiALRRY 
FQESISTLE 


6966 


820 


1867 


XITALGVRGMPGCPCPGCGMAGPRI.LFLTAiALELIiGRAGGSQP 

alrsrgtatacrldnkeseswgallsgerldtwicsllgslmvg 

LSGVFPL,r>VIPLEMGTMLRSEAGAWRI,KQLLSFALGGt.LGNVFL 
HLLPEAWAYTCSASPGGEGQSLCXX^QQUSLWVIAGILTPLALEK 
/HVPGQQGGGDQPGPQQRPHCCCRRAQWRPLSGPAGC31ARPRCR 
GP\DIKVSGYLMLLANTIDNFraGLAVAASPLVSKKIGLLTTMA 
ILI.1IE I PHEVGDFAI LLRAGFDRWSAAKLQLSTALGGLLGAGPA 
ICTQSPKGVEETAAWVLPFTSGQFLYIALVNVLPDU^EEEDPW 


6967 


162 


633 


GFI*PFKYWIliDI*SASSRMETDCNPMELSSMSGFEEGSELNGF^G 
TDMKDMRLEAEAWNDVIiFAVNNMFVSKSLRCADDVAYINVETK 
ERKRYCLELTEAGLKWGYAFDQVDDHLQTPYHETVYSLLDTLV 
SPAYREAFGKR\LLORtiEAI.KRDGQS 


6968 


1 


22GS 


RGGGGGRGGPGARERERPGEPBRTMEAAAGGRGCFQPHPGLQKT 

LEQFHLSSMSSLGGPAAPSARWAQEAYKKESAKEAGAAAVPAPV. 

PAATEPPPVLHIiPAlQPPPPVLPGPFFMPSDRSTERCETVItEGE*'' 

TISCFWGGEKRbCLPOIIiNSVLRDFSLQaiNAVCDELHIYCSR 

CTADQLEILKVMGILPFSAPSCGLlTKTDAERLCNAtiLYGGAYP 

PPCKKEIAASLALGLELSERSVRVyHE\CFGKCKGL\LVPELYS 

SPSAACIQCLD\CRLMYPPHKFVVHSHKALENRTCHWGF\DSA\ 

NWRAYILLSQDYTGKEEQARLGR\CLDDVKEKFDYGNKYKRRVP 

RVSSEPPASIRPKTDDTSSQSPAPSEKDKPSSWLRTLAGSSNKS 

LGCVHPRQRLSAFRPWSPAVSASEKEIiSPHLPALlRDSFYSYKS 

FETAVAPNVALAPPAQQKWSSPPCAAAVSRAPEPLATCTQPRK 

RKLTVDTPGAPETLAPVAAPEEDIOJSEAHVBVESREEFTSSLSS 

LSS PS PTSSSS AKDLGSPGARALPSAVP£)AAAPADAPSGLEAEri 

EHLRQALEGGIiDTKEAKEKFLHEWKMRVKQEEKLSAALQAKRS 

LHQELEFLRVAZKEKrj^EATEAKRNliRKEIERLRAENEKKMBCEA 

NESRLULKRELBQARQARVCDKGCEAGRLRAKYSAQIEDLQVKL 

QHAEADREQLRADIiLREREAREHI»EK\VVK\ELQEQLWPRARPE 

AAGSEQ\AABtiEP 


6969 


1855 


118 


agtmhgrlkvktseeqaeakrlereqklklyqsXtqavfqkrqa"" 
geiidbsvleltsqilganpdfatlwncrrevlqqletqkspeel 
aalvkaeixsplesclrvnpksygtv-mhrcwllgrlpepnvjtrei, 
elcarplhvdernfhcwdyrrpvatqaavppaeeiaptdslitr 

NPSNYSSMHYRSCLLPQLHPQPDSGPQGRLPEDVLLKELELVQN 
AFFTDPNDQSAWFYHRWLLGRADPQDAtiRCLHVSRDEACLTVSF 
SRPLLVGSRMEILLU4VDDSPLIVEWRTPDGRNRPSHVWLCDLP 
AASLNDQLPQHTFRVIWTAGDVQKECVI^LKGRQEGWCRDSTTDE 
QLFRCELSVEKSTVriQSEI.ESCKEU2EliEPEl!nCWCL\LTlILLM 
RALDPLLYEKETLQYFQTLK\AWDPKRATY\IiDDLRSKFLLENS 
VLKMEVABVRVU^oAHKDLTVr^LEQLLLVTHLDLSHNRLRTL 
PPAI»AALRCLEDPPPRT\VLQASDNAIESLDGVTNliPRLQEHjL 
CNNRLQQPAVLQPLiASCPRIiVIiiNLQGNPLCQAVGILEQIiAELrj 
PSVSSVLT


6970


3 


1528 


SFPPIiliSSPSAVGEGKVAVAAPCPGRSECARAKMAYIQIiEPLNE ' 

GFLSRISGLLLCRWTCRHCCQKCYESSCCQSSEOBVEILGPFPA 

QTP P WLMAS RSS DKDGDS VHTAS EVPLTPRTNSPDGRRSSSDTS 

KSTYSLTRRISSLESRRPSSPLIDIKPIEFGVLSAKKEPIQPSV 

LRRTYNPDDYFRKFEPHLYSZiDSNSDDVDSLTDEBIIiSKYQriGM 

LHPSTQYDLIjHNHLTVRVIEARDLPPPISHDGSRQDMAHSNPYV 

KICLL.PDQKNSKQTGVKRKTQKPVFEERYTFEIPFI1EAQRRTLI1 

LTWDFDKFSRHCVIGKVSVPLCEVDLVKGGHWWKALIPSSQNE 

VELGELLLSIiNYLPSAGRIiNVDVXRAKQLLQTDVSQGSDPPVKI 

QLVHGLKLVKTKKTS FLRGTIDP FYNESPSPKVPQEELENASLV 

FTVFGHNM1CSSNDFIGRIVTG\QYSSGP\SEPNHWRRMUJTHRT 

AVEOWHSI*RSRAECDRVSPASLEVT 


6971 


37 


3702 


ACFYVPGSRSFKlilPRHGLVNMGRSGKLPSGVSAKIiKRWKKGHS 
SDSNPAICRHRQAARSRFFSRPSGRSDLTVDAVKLHNELQSGSL 
RLGKSEAPETPMEEEAELVLTEKSSGTFI*SGIjSDCTNVTPSKVQ 
RFWESNSAAHKEICAVLAAVTEVIRSQGGKETETEYFAAIilRKA 
AQHGVCSVLKGSEFMFEKAPAHHPAAISTAKFCIQEIEKSGGSK 
EATTTIJIMLTLLKDIJCiPCFPEGLVKSCSETLIJ^VMTLSHVLVTA 

camqafhslfharpglstlsaelnaqiitalydyvpsendlqpi, 
iawlkvmekahini.vrlqwdlgiighijprffgtavtciii»sphsqv 
ltaatqslkeilkecvaphmadigsvtssasgpaqsvakmfrav 

EEGIiTYKFHAAWSSVLQLLCVFFEACGROAHPVMRKCLQSLCDL 
RLS PHFPHTAALDOAVGAAVTSMGPE WLQAVPLEI DGSEETTiD 
FPRSWIjLPVIRDHVQETRLGFFTTYFLPIANTLKSKAMDLAQAG 
SXVESKIYDTLQWQMWTLLPGFCTRPTDVAISFKGLARTLGM/VI 
SERPDLRVTVCQALRTLITKGCQAEADRAEVSRFAKNPLPILFIJ 
LYGQPVAAGDTPAPRRAVLETIRTYLTIXDTQLVNSIiLEKASER 
VIiDPASSDFTRLSVX.DLWALAPCADEAAISKLYSTIRPYLESK 
AHGVQKKAYRVLEEVCASPQGPGALPVQSHLEDLKKXLIiDSLRS 
rSSPAKRPRLKCIiLHIVRKLSAEHKEFITALXPEVILCTKEVSV 
GARKNAFAIaLVEMGHAFLRFGSNQEEALQCYLVLIYPGLVGAVT 
MVSCS I LAIiTHLbFEFKGLMGTSTVEQLLENVCLLUVSRTRDW 
KSAliGPT FCVAVTVMDUAHT.n tftn/nT.VMTTt T/^VT CT^TMinnvcntutv 

LRNLFTXKFIPKVFGILTHGKKAVGPKEYHRVLVNIRKAEARAK 
RHRAIiSQAAVEEEEEEEEEEEPAQGKGDSIEEILADSEDEEDNE 
EEERSRiGKEQRKLARQRSRAWLKEGGGDEPUIFLDPKVAQRVXiA 
TQPGPGRGRKKDHSFKVSADGRLIIREEADGNKMEEEBGAKGED 
EEMADPMEDVIIRNKKHQKLKHQKEAEEEELEIPPQYQAGGSGI 
HRPVAKKAMPGAEYKAKKAKGDVKKKGRPDPYAYIPIiNRSKWJR 
RKKMKLQGOFKGLVKAAQRGSQVGHKWRRKDRRP 


6972 


2179 


973 


PGGAII.LPLWRRTRPREATVPRGAAQRGRARSAEGRI PSSQS PS 
PAEAGGATRSPPPRPPRP7VRPPGPSAPPLLRSDAGPGATV5AAA 
AAATERARRGATMGAQLSTLGHMVLPPVWFIiYSLLMKLPQRSTP 
AITLESPDIKYPLRLIDREIISHDTRRFRFALPSPQHrLGLPVG 
QHIYLSARXDGNLWRPYTPISSDDDKGFVDLVIKVYFKDTHPK 
FPAGGKMSQYLESMQIGDTIEFRGPSGLLVYQGKGKFAIRPDKK 
SNPIlRTVKSVGMIAGGTGITPMIiQVIRAIMKDPDDHTVCHLLF 
ANOTEKDHiLRPELEELRNKHSARFKLWyTIiDRAPEAMDYGQG \ 
FVNEEMIRDHLPPPE\EEPLVLMCGPPPMIQYACLPNXi\DHVGH 
PTERCFVF 


6973 


1 


1964 


LQPRCAHRGLRAQKCGRPAPGVDAMVLCPVIGKIiIiHKRVVLASA 
SPRRQEXLSNAGLRFEWPSKFKEKLDKASFATPYGYAMETAKQ 
KAIiEVANRIiYQKDLRAPDWlGADTIVTVGGLILEKPVDKQDAY 
RMLSRPE/SGREHSVPTGVAIVHCSSKDHQLDTRVSEFYEETKV 
KFSELSEELLWEYVHSGEPMDKAGGYGIQALGGMLVESVHGDFL 
NWGFPLNHFCKQLVKLYYPPRPEDIiRRSVKHDSIPAADTFBDL 











SDVEGGGSEPTQRDAGSRDEKAEAGEAGQATAEAECHRTREtLP" 

PFPTRLLELIEGFMLSKGLIiTACKLKVFDLLKDEAPQKAADIAS 

KVDASACGMERLLDICAAMGLLEKTEQGYSNTETANVYLASDGE 

YSIilGFIMHNNDLTWNLPTYLEFAIREGTMQHHRAI/SKKAEDLF 

QDAYYQSPETRLRFMRAMHGMTKLTACQVATAFNLSRPSSACDV 

GGCTGALARELAREYPRMQVTVFDLPDIIEUVAHFQPPGPQAVQ 

IHFAAGDFFRDPLPSAELYVLCRIIiHDWPDDKVHKLLSRVAESC 

KPGAGLLLVETLUr>EEKRVAQRAI>MQSI*MMLVOTEGKERSLGEY 

QCLLBLHGFHQVQVVHLGGVLDAIL\PPKWPPEAQAACSr, 


6974 


3082


2172 


RSCAAFASfASRPPLELFAPPGSHRSPPGRGVATSAQCALSVRK 
LLAARPGLGTKYQATMVYKTLFALCILTAGWRVQSLPTSAPLSV 
SLPTMIVPPTTIMTSSPQNTDADTASPSNGTHNNSVLPVTASAP 
TSLLPKNISIESREEEITSPOSNWEGTNTDPSPSGFSSTSGGVH 
IiTTTLEEHSLGTPEAGVAATLSQSAAEPPTLISPOAPASSPSSL 
STSPPEVFSASVTTNHSSTVTSTQPTGAPTAPESPTEESSSDHT 
PTSHATAEPVPQEKTPPTTVSGKVMCEI,IDMET\PPPPPG 


6975 


2 


soo 


RPRPTVHCCK^JALKLETJU^ETI^INVFHAHSGKEGDKYKLSKldBL 
KELLQTELSGFLDVKELML* ATEALKTFEEA* KSPI IQCSSSRS 

slppapqpppyl*lsavpfp:hlpl.pllppqaqkdvdavdkvmk 

ELDEKGDGEVDFQE Y WLVAALTVACNNFFWENS 


6976 


1216 


970 


GCQL*VAYGTTENSPVTFAHPPEDTVEQKAESVGRIMPHTEARI 

mMEAGTIAKliNTPGEIiCIRGYCVMUSYWGEPQKTEEAVDQDKW 

YWTGDVATMNEQGFCKIVGRSKDMIIRGGENIYPAELEDFFHTH 

PKVQEVQWGVKDDRMGEEICACIRLKDGEETTVEEIKAFCKGK 

ISHFKlPKYIVFVTNYPLTISGKlQKFKLREQMERHIiNL*IKQO 
ACPGRIA ' 


6977


1298 


sea 


SIiFINTNI>L.SNQIRKTSFGMCSEPISDNTEDQKGiaiKTPDFA*R^ 

ANKKSKHHVNGNRTVEPFPEGTQMAVFGMGCFWGJ\ERKFWVIiKG 

VYSTQVGFAGGYTSNPTYKEVCSEKTGHAEWRWYQPEHMSFE 

ELLKVFWENHDPTQGMRQGNDHGTQYRSAIYPTSAKQMEAALSS 

KENYQKVLSEHGPGPITTDIREGQTFYYAEDYHQQYLSKNPNGY 

CGIX3GTGVSCPVGIKK 


6978 


3 


242 


SFPFRDSRRCGCCKGSSLRHTAVAMVtCLSKEAKQRLQQLFKGSQ 
FAIRWGFIPLVIYLGFKRGADPGMPEPTVl^SLLWG 


6979 
6980 


3917 
X 


1146 
420 


DEARVRGEAVAAAII,SRCRHWSGPPPFPPSPPDRKGLRGTEPWE" 

AGPGSGATPGARAMDVRRLKVNELREELQRRGLDTRGriKTELAE 

RLQAAI^EAEEPDDERELDADDEPGRPGHIKEEVETEGGSELEGT 

AOPPPPGLQPHAEPGGYSGPDGHyAMDNXTRQNQFYDXQVIKQE 

NESGYERRPLEMEQQQAYRPEMKTEMKQGAPTSFLPPEASQUCP 

DRQQPQSRKUPYEENRGRGYFEHREDRRGRSPQPPAEEDEDDFD 

DTIiVAIDTYNCDl.HFKVARDRSSGyPLTIEGFAYIiWSGARASYG 

VRRGRVCFEMKINEEISVKHtiPSTEPDPHWRIGWSUDSCSTQL 

GEEPFSYGYGGTGKKSTNSRFENYGDKFAENDVIGCFADFECGM 

DVELSFTKNGKWMGIAFRIQKEALGGQALYPHVLVKWCAVEFMF 

GQRAEPYCSVLPGFTFIQHLPLSERIRGTVGPKSKAECEILMMV 

GLPAAGKTTWAIKHAASNPSKKYNILGTNAIMDKMRVMGLRRQR 

NYAGRWDVI,IQQATQCLNRI.lQIAARKKRNYIIiDQTWVYGSAQR 

RKMRPFEGFQRKAIVICPTDEDLKDRTIKRTDEEGKDVPDHAVL 

EMKANFTbPDVGDFIiDEVLFIELQREEADKLVRQYNEEGRKAGP 

PPEKRFDNRGGGGFRGRGGGGGFQRYENRGPPGGNRGGFQNRGG 

GSGGGGtnrRGGFWRSGGGGYSQNRWGNKNRDNNNSNNRGSYMRA 

PQQQPPPOQPPPPQPPPQQPPPPPSYSPARNPPGASTYNKNSNI 

PGSSANTSTPTVSSVTSPPQSPGFFPSTFQPSYSQPPYNQGGYSQ 

GYTAPPPPPPPPPAYNYGSYGGYNPAPYTPPPPPTAQTYPQPSY 

KQYQQYAQQWNQYYQNQGQWPPYYGNYDYGSYSGNTQGGTSTQ 

GTRGRKTGRVAAPSTRRRTGWWQKUJTRSPAMSLSDPGliGYHPr 











CWTLRWPPI.c:SLHALHVFHCI.FSSRLGTPVSPRLAMDPNCSCEA 

GGSCACAGSCKCKKCKCTSCKKSCCSCCPLGCAKCAQGCICKGA 
SEKCSCCA 


6981 


10 


1054 


PGRGFRRASLRPAFAARGVFQGGLGQAKQARTRACAALPTPHPS 
APRLLEPQGVFSLFPPPPGPWPHMIIjTKAQYDEIAQCLVSVPPT 
RQSLRKLKQI^FPSQSQATLLSIFSQEYQKHIKRTHAKHHTSEAI 
ESYYQRYLNGWKNGAAPVUI,DLANEVI)yAPSLMARLIL,ERFLQ 
EHEETPPSKSIINSMLRDPSQIPDGVLANQVYQCIVNDCCYGPL 
VDCIKHAIGHEHEVLLRDLLLEKNLSFLDEDQLRAKGYDKTPDF 
ILQVPVAVEaHIIHWrESKASFGDECSHHAYLHDQFWSYWNRFG 
PGLVIYWYGPIQELDCNRERGlLLKACFPTNIVTLCHSrA 


6982 


153 


128S 


fpqqdcsapaapglagseprrlrayrrrrqrarglkrVawlKpp 

PSLLQGLQGWAQAPVDGTLGPEDSRASSPMIQNSRPSLI^PQDV 
GDTVETLMLHPVIKAFL.CGSISGTCSTLLFQPLDLLKTRLQTLQ 
PSDHGSRRVGMIiAVLLKWRTESliLGLWKGMSPSIVRCVPOVGI 
YFGTLYSLKQYFLRGHPPTALESVMLGVGSRSVAGVCMSPITVl 
KTRYESGKYGYESIYAALRSIYHSEGHRGLFSGLTATLLRDAPF 
SGIYLMFYNQTKWIVPHDQVDATLIPITNFSCTGIPAGILASLVT 
QPADVIKTHMQLYPLKFQWIGQAVTLIFKDYGLRGPFQGGIPRA 
LRRTLMAAMAWTVYEEMMAKMGIiKS 


6983 


82 


773 


EMSFLQDPS FFTMGMWS IGAGAIiGAAAliAia.IiANTDVFL.SKPQK 

AALEYLEDIDLKTLEKEPRTFKAKELWEKNGAVTMAVRRPGCPL 

CREEAADIiSSLKSMLDQLGVPLYAWKEHIRTEVKDFQPYFKGE 

IFI*DEKKKFYGPQRRKMMFMGFIRLGVWYNFFRAWNGGFSGKLE 

GEGFILGGVPWGSGKQGriXEHREKEFGDKVNLLSVLEAAKMI 
KPQTLASEKK 


6984 


1845 


1282 


ggrsayslpagslprvpataaakmasgvqvadevcrxfydmkvr" 

KCSTPBBIKKRKKAVIFCIiSADKKCXIVEEGKEILVGDVGVTIT 
DPFKHrVGMI.PEKDCRYALYDASFETKESRKEELMFFI,WAPEIA 

plkskmiyasskdaikkkfqgikhecqangpedlnraciaeklg 

GSIiIVAFEGCPV 


6985 


1887 


1324 


RRTAGIYPCi'PKPGRTRKALCSWLLIJjTGQIAFDDFQESCAMM 

wqkyagsrrsmplgarilphgvfyaggfaivyyliqkphsraly 

YKLAVEQLQSHFEAQEAIiGPPRJIHYLKLIDRElIFVDlVDAKLK 
j.i;'V£>i;ji»^t»fc,t3XjLiXVHSSRGGPFQRWHIjDEVFIjELiaX3QQlPVFK 
LSGENGDEVKKE 


6986


642 


1350 


YHLYFKMGDPNS RKKQALNl^LRAQLRKKKESLADQFDPKMY lAF 
VFKEKKIOCSALFEVSEVIPVMTNNYEENILKGVRDSSYSLESSL 
ELLQKDWQLHAPRYQSMRRDVIGCTQEMDPILWPRNDIEKIVC 
LLFSRWKESDEPPRPVQAKFEFHHGDYEKQFLHVLSRKDKTGIV 
VNNPNQSVFLFIDRQHLQTPKNKATIFKLCSICLYLPQEQLTHW 
AVGTIEDHIiRPYWPE 


6987 
6988


1623 


341 


LEAAEKASRAFKESQRQTDSKNYETENWSPQKSQRRYDMYNTAC" "" 

Fl^ElEVGLYTIQILQLTPFFHmJELSKKHMVQFLSGKWTIPP 

DPRNECYLALSKFTSHLKNI,QSDLKRCF0FFIDYMVLLKMRYTQ 

KEIAEIMI^SKKVSRCFllICYTELFCHLDPCIiIiQSKESQLLQEENC 

RKKLBT^RADRFAGLLEYLKrPNYKDATTMESIVNEYAFLLQQNS 

KKPMTNEKQNS ILANI XLSCLKPNSKl. IQPLTTLKKQLRE VIXJF 

VGLSHQYPGPYFIiACLLFWPENQEUJQDSidiXEKYVSSLNRSFR 

GQYKRMCRSKQASTLFYLGKRKGLNSIVHKAKIEQYFDKAQNTN 

SLWHSGDVWKKNEVKDLI,RRLTGQAEGia»rSVEYGTEEKIKIPV 

ISVYSGPLRSGRNI ERVSFYLGFSIEGPPGL 




3 


689 


TQLLRRPAVFVGSAASGIRSGLWSASSGHWCAPAAGRAHAPVPR 
LVRGLGAASTAAPQDAQTGPQPMPRADCIMRHLPYFCRGQWRG 
FGRGSKQLG I PTANFPEQWDNLPADISTGIYYGWAS VGSGDVH 
KMVVSIGWNPYYKNTKKSMETHrMHTFKEDFYGEILEWArVGyi. 











RPEKNFDSLESLISAIQGDIEEAKKRIiELPEHIjKIKEDNFFQVS 
KSKIMNGH 


6989 


2 


1118 


LMPSDRPI.SPSTHASAGSHCHAPPTTARRAFPIPFGSKSNMATL 
KDQLIYNLLKEEQTPQNKITWGVGAVGMACAISILMKDLADEL 
AliVDVIEDKLKGEMMDLQHGSLFUlTPKIVSGKDYNVTANSKIiV 
I ITAGARQQEGESRLNLVQRNVWIFKFI IPNWKYSPNCKLI,! V 
SNPVD IIiTYVAWKI SGFPKNRV IGSGCNLDSARFRYLMGERIiGV 
HPLSCHGWVLGEHGDSSVPVWSGMNVAGVSJjKTIjHPDLGTDKDK 
EQWKEVHKQWESAYEVIKLKGYTSWAIGLSVADIAES IMKNLR 
RVHPVSTMIKGIjYGIKDDVFLSVPCILGQNGISDLVKVTLTSEE 
EARLKKSADTLWGIQKEliQF 


6990 


719 


258


THASGMASWLALRTRTAVTSLLSPTPATAIAVRYASKKSGGSS 
KNLGGKSSGRRQGIKKMEGHYVHAGNIIATQRHFRWHPGAHVGV 
GKNKCLYALEEQIVRYTKEVYVPHPRNTEAVDLITRIiPKGAVLY 
KTFVHWPAKPEGTFKLVAML 


6991 


169 


451 


RRSSDFHN POFLS R P VSLREN I HHQV I CSTKNKRRNPKKIAYLL 
SSLLMTNLNPKESTENQPVDAYWAFTLDQEFLTYACVEGTGCLF 
CGRHVH 


6992 


944 


510 


RQAPGCS S LALRQ VRQVYCGliVRAPQVQTRPIiSSRFVERRGALY 
RSPMNQENPPPYPGPGPTAPYPPYPPQPMGPGP^4GGPYPPPQGY 
PYQGYPQYGWQGGPQEPPKTTVYWEDQRRDELGPSTCLiTACWT 
ALCCCCIiWDMLT 


6993 


1 


374 


"'qwcvtcpqhnarqgpavppgiqaygaapfedlqvdftemskcrg 
drvwiknwnvaslcplwkgpqtwlspptavkvegipawihhsh 
vkpaaretwearpspdwpfrvtlkkttspapvtpgs 


6994 


346 


1100 


QWPBKDPVMAASSISSPWGKHVFKAIIoMVLVALIIiLHSALAQSR 
RDFAPPGQQKREAPVDVLTQIGRSVRGTLDAWIGPETMHLVSES - 
SSQVLWAI SSAI S VAFFAIiSGIAAQIiLNALGIiAGDYtiAQGLKLS 
PGQVQTFLLWGAGALWYWLLSLLIiGLVLAI.riGRILWGIiKLVIF 
LAGPVALMRSVPDPSTRALI>I#I*ArJ*Il4YALLSRLTGSRASGAQL 
EAKVRGLEROVEELRWRQRRAAKGARSVEEB 


6995 


144 


1346 


GSVAVGLSGIMAAQKDLWDAIVJGAGIQGCFTAYHLAKHRKRIIi 
IiLEOFFLPHSRGSSHGQSR'riRKAYLEDFYTRMMHECYQIWAQL 
BHEAGTQIiHRQTGIiLLIiGMKENQELKTIQANliSRQRVEHQCI^SS 
EEIiKQRFPNIRLPRGEVGLLDNSGGVIYAYKALRAIODAIRQLG 
G I VRDGEKWE INPGLLVTVKTTSRS YQAKSI^VITAGPWTNQIjI:. 
RPIiG I EMPLQTLRIJNVCYWREMVPGS YGVSQAFPCFLWLGLCPH 
HIYGLPTGEYPGLMKVSYHHGNHADPEERDCPTARTDIGDVQIXi 

GAGPSGHGFKIAPWGKlLYELSMKIiTPSYDLAPFRISRFPSLG 
KAIlIj 


6996


543 


1942 


'ETANAEAAARKSAMDWKEVBRRRXATPNTCPNKKKSEQELKDEE 
MDLFTKYYSEWKGGRKNTWEFYKTIPRFYYRLPAENEVIjLQKLR 
EESRAVFLQRKSRELLDNEELQNLWFLU5KHQTPPMIGEEAMIN 
YEMPLKVGEKAGAKCKQFFTAKVPAKLLHTDSYGRISIMQPFNY 
VMRK:VWIUHQTRIGLSLYBVA6QGYI»RESDLEWYriiEI.lPTLPQL 
DGIiEKSFYSFYVCTAVRKPFFFLDPI»RTGKIKIQDtrACSFLDD 
LLELRDEEIjSKESQETNWFSAPSALRVYGQyiiNLDKDHNGMLSK 
EEXiSRYGTATMTmrFLDRVFQECLTYDGEMDYKTYIjDFVIiALEN 

rkepaalqyifklldienkgylnvfslnyffraiqelmkihgqd 
pvs pqdvkdel pdmvkpkdplkislqdlimsnqgdtvttilidl 

NGFWTYENREAliVANDSENSADLDDT 


6997 


370 


1104 


AMELTIFlLRLAIYILTFPIiYLUlJFLaLWSWICKiCWFPYFliVRF 
TVIVNEQMASKKREIiFSKLQEPAGPSGKLSLIiEVGCGTGANFKF 
YPPGCRVTCIDPNPNFEKFLIKSIAENRHLQFERFWAAGENMH 











Q VAIXSS VDVWCTLVLCS VKNQERI IjREVCRVLRPGGAF VFMEH ■ 
VAAECSTWNYFWQQVLDPAWHLLFDGCNIiTRESWKAItERTlSFSK 
LKLQHIQAPLSWELVRPHIYGYAVK 


6998 


2 


616 


FVSRALLRVRSRRHPAEERAAPGRPEDAPIECPGATNCPEPLWC 
SHLPVPYAPPTMESRGKSASSPKPDTKVPQVTTBAKVPPAADGK 
APLTKPSKKEAPAEKQQPPAAPTTAPAKKTS7VKADP7VLLNNHSN 
LKPAPTVPSSPJDATPEPKGPGDGAEEDEAASGGPGGRGPWSCEN 
FNPLXiVAGGVAVAAIALILGVAFLVRKK 


6999 


14 


1591 


GRAGACSRRDTAMSIEIESSDVIRLIMQYIjKENSLHRAIiATLQE 
ETTVS lOTVDS I ES FVADIMSGHWDTVLQAIQSLKLPDKTLIDL 
YEQWLELIELRELGAARSLLRQTDPMIMUCQTQPERYIHLBNL 
LARSYFDPREAYPDGSSKEKRRAAIAQALAGEVSWPPSRLMAL 
LGQALKWQQHQGI^LPPGMTIDLFRGKAAVKDVEEEKFPTQbSRH 
rKFGQKSKVECARPSPDGQYLVTGSVDGFIEVWNFrTGKIRBCDL 
KYOAQDNFMMMDDAVLCMCFSRDTEMLATGAQDGKIKVWKIQSG 
QCI*RRFERAHS KGVTCLS FS KDSSQI LSASFDQTIRIHGLKSGK 
TLKE FRGHSS F VNEATFTQDGHYl 1 S ASSDGTVKI WNMKTTECS 
NTFKS hGS TAGTDI TVNS VI LLPKNPEHPWCNRSNTWIMNMQ 
GOI VRS FSSGKREGGDFVCCALS PRGEWr YCVGEDFVLY CFSTV 
TGKLERTLTVHEKDVIGIAHHPHQNLXATYSEDGLLKLWKP 


7000 


2 


827 


GPGWFLELMESEGPPESERSEPFSQREEENEEEEAOEPEETGP 
KNPIiLQPAIiTGDVEGr^QKlFEDPENPHHEQAHQLLLEEDlVGRN 
LLYAACMAGQSDVIRALAKYGVNLWEKTXRGYTLLHCAAAWGRli 
ETLKALVELDVDIEALNFREERARDVAARYSQTECVEFLDWADA 
RLTLKKYIAKVSIAVTDTEKGSGKLLKEDKNTILSACRAKNEWL 
ETHTE AS INELFEQRQQLED I VTPI FTKMTTPCQVKSAKS VTS^ 
DQKRSQDDTSN 


7001 


2056


844


RRCIjIIAPLKaCFIFIYFIFIFETEFLSGCPGWSAVAQSRLIAN 
FASQVQAIFILPKDSQVGPDVKSEAAPKRAIiYESVFGSGEICGP 
TS PKRliCIRPSEPVDAWWS VKHDPLPLIiPEANGHRSTNS PTI 
VS PAI VS PTQDSRPNMSRPLITRSPAS PIiNNQGI PTPAQLTKSN 
APVHIDVGGHMYTSStATLTKYPESRlGRLFDGTEPIVLDSLKQ 
HYFIDRDGQMFRYIIiNFURTSKLLIPDDPKDYTLLYEEAKYFQL 
QPMLLEMERWKQDRETGRFSRPCECLWRVAPDLGERITLSGDK 
SLIEEVFPEIGDVMCNSVNAGWKHDSTHVIRFPIiNaYCHiNSVQ 
VLERLQQRGFEIVGSCGGGVDSSQFSEYVIjRREIiRRTPRVPSVI 
RIKQEPLD 


7002 


1043 


498 


PMPSS TRWrXS * TYTDTSS AWACRPrrGTCT* rAAPGPT VR WWP 
TP CS RHQS RRRl»TCWCi>T£»RP CGR*GGLiCVRTAP T 

swtsagtswpagrrtgtatsgtatttsvwpgcgtrmwstqwssv 

PRSRSCCSRPATTPPSKPGAPHAPCASSRHLAHGLAPSSPGLPA 

rgaevc 


7003 


818 


61 


QGRFRAFCWQRDF1>QPPGMRL.SAL1iAIjASKVTI»PPHYRYGMSPP 

gsvadkrknppwirrrpvwepisdedwylfogdtveilegkda 
gkqgkwqvirqrnwvwgglnthyryigktmdyrgtmipseap 

LLHRQVKLVDPMDRKPTEI BWRFTE?U3ERVRVSTRSGRI IPKPE 
FPRADGIVPETWIDGPKDTSVEDALERTYVPCLKTLQEEVMEAM 
G I KETR\NTRRS IGI EPGAEQIiL.PNFCPSLEG 


7004 


121 


2285 


FLLPVI^TSRSLRQPAVPHARLGGVEPAAMKSARAKTPRKPTVKK 
GNPKRTLKTQLG/YYOfiVRPLGFPDQECCIEVINNTTVQLHTPE 
GYRLNRUGDYKETQYSFKQVFGTHTTQKELFDWANPLVWDIjIH 
GKNGLIiFTYGVTGSGKTHTMTGSPGEGGLLPROUDM I FNS IGS F 
QAKRYVFKSNDRNSMDIQCEVDALIiERQKREAMPNPKTSSSKRQ 

vdpefadmrxvqefckaeevdedsvygvfvsyieiynnyiydll 
eevpfdpinpnlhnlncfvkiknhnmyvagctevevkstebafe 

VFWRGQKKRR lANTHIiNRESSRSHSVFWI KLVQAPLDADGDNVti








QEKEQITISQLSLVDLAGSERTNRTRAEGNRI^REAGNINQSLMT 
laTCMDVLRENQMYGTNKMVPYRDSKLTHLFKNYFIXSEGKVRMI 
VCVNPKAEDYEENLQVMRFAEVTQEVEVARPVDKAICGLTPGRR 
YRNQPRGP\IGNEPLVTDWLQSFPPLPSCEILDINDEQTLPRL 
lEALEKRHNLRQMMIDEPNKOSNAFKALLQEPDNAVLSKENHMQ 
GKlM:KEKMISGQKLEIERLEKKNKTr,EYKIEII*EKTTTIYEED 
KRNLQQELETQNQKI^RQFSDKRRLEARIiQGMVTETTMKWEKEC 
ERRVAAKQLEMQNKLWVKDEKLKQLKAIVTEPKTBKPERPSRER 
DREKVTQRSVSPSPVPVSYIj 


7005


63 


876 


RNMALYQRWRCLRLQGLQACRLHTAWSTPPRWLAERLGLFEElT" 

WAAQVKRLASMAQKEPRTIKISLPGGQKIDAVAWNTTPYQLARQ 

ISSTIADTAVAAQVNGEPYDLERPLETDSDLRFLTFDSPEGKAV 

FWHSSTHVLGAAAEQFLGAVLCRGPSTEYGFYHDFFIiaKERTIR 

GSELPVLERICQEliTAAARPFRRLEASRDQLRQLFKDWPPKIiHL 

IEEKVTGPTATVYGCGTI,VDLCQGPHIiRHTGQIGGl.KLLSNSSS 
LWRSSG 


7006 


22 


898 


NAFGRHSTAVKMAAAAWLQVIiPVILLiLLGAHPSPt>SPFSAG"^Sf~ 

VTUUVDRSKWHIPIPSGKNYPSFGKILFRNTTIFLKFDGEPCDIiS 

LNITWYLKSADCYNEIYNFKAEEVELYLEKLKEKRGLSGKYQTS 

SKLPQNCSELPKTQTFSGDFMHRIiPIiLGEKQEAKENGmLTPIG 

DKTAMHEPLQTWQDAPYIFIVHIGISSSKESSKBNSLSNTiFTMT 

VEVKGPYEYLTLEDYPLMIFFMVMCIVYVLFGVLWIiAWSACYWR 

DLbRIQFWIGAVlFLGMIiEKAVFYAGFQ 


7007


2 


1001 


AMTVSGPOTPEPRPATPGASSVEQLRKEGNELPKCGDYGGiCLAA 
YTQALGLDATPQDQAVLHRNRAACHUCLEDYDKAETEASKAIEK 
DGGDVKALYRRSQALEKt^RLDQAVLDLQRCVSLEPKNKVPQEA ' 
LRNIGGQIQEKVRYMSSTDAKVEQMFQIXiLDPEEKGTEKKQKAS'*-' 
QNIiWLAREDAGAEKI FRSNGVQLIiQRLLDMGETDLMLAAtRTL 
VGICSEHQSRTVATLSII.GTRRWSILGVESQAVSI1AACHLLQV 
MFDALKEGVKKGFRGKEGAIIVGEWKjQVWGLLDVTVWEGMGriSQ 
PGQFFGDQTCSCRLFG I RFGDI I LL 


7008 


70 


1478 


CRSALGHERPPPAHLPAGGRRLQTCPRSCRWIifeRPPSGIiPPGPR^ 

SPPPLAGPGQKMVQKKPAELQGFHRSFKGQNPFELAPSLDQPDH 

GDSDFGLQCSARPDMPASQPID I PDAKKRGKKKKRGRATDSFSG 

RFEDVYQLQEDVLQEGAHARVQTCINLITSQEYAVKIIEKQPGH 

IRSRVFREVEMLYQCQGHRNVLELIEFPEEEDRFYLVFEKMRGG 

S I LSHIHKRRHFNELEAS WVQDVAS ALDFIiHNKGIAHRDLKPE 

NILCEHPNQVSPVKICDFDLGSGIKLNGDCSPISTPEUCTPCGS 

AEYMAP£rWEAFSBEASryDKRCDLWSLGVir.yXLl»SGYPPPVG 

RCGSDCGWDRGEACPACONMLFP'JirjPnJCYPPPnTmwaTTTcr'aa 

KDLISKLIiVRDAKQRLSAAQVLQHPWVQGCAPEWTLPTPMVLQR 
WDSHFLIiPPHPCRIHVRPGGLVRTVTVNB 


7009


1 


626 


ARQLRNSWVDDFVAAPliIPIiSQQIPTGKSLYESYYKQVDPAYTG 
RVGASEAALFLKKSGLSDIIIiGKIWDLADPEGKGFIiDKQGFYVA 
LRLVACAQSGHEVTLSNIJJLSMPPPKFHDTSSPIjMVTPPSAEAH 
WAVRVEEKAKPDGI FESLLPINGLLSGDKVKPVTJ^NSKLPLDVL 
GRVWDLSDIDKDGHIiVRDEFAVAMHLVYRALE 


7010 


79 


571 


SHTRRAWPETtjLSPLCPliIjGGGTAMSGGEQKPERYYVGVnVGT' 
GSVRAAIiVDQSGVIiLAFADQPIKNWEPQFNHHEQSSEDIWAACC 
WTKKWQG IDLNQI RGLGFDATCSIiWLDKQFHPLPVNQEGDS 
HRNVIMWLDHRAVSQVNRINETKHSVLQYVGG 


7011 


3 


994 


RIQTLPNQNQSQTQPLLKTPPAVLOPIAPQTTFGVQTOPQPQSL 
LQAQISAASITPLLQTQPQPLLQQPQQKAGLLQPPVRIVSQPQP 
ARRIiDPPSRFSGRWDRGDQVPNRKDDRSRERERERRRSRERSPQ 
RKRSRERS PRRERERS PRRVRRVVPRYTVQFSKFSUDCPSCDMM 
ELRRRYQNLYIPSDFFDAQFTWVDAFPIiSRPPQLGNYCNFYVMH 





7012 






REVESLEKWMAILDPPDADHLySAKVMimSPSMEDi:Y!^ 

AEDPQELKDGFQHPARLVKFLVGMKGKDEAMAIGGHWSPSLDGP 
DPEKDPSVL1KT\AIRCCKALTG 


7013 


1 


2661 


RRAGt>VKKCifc;AKI,FGPTEROSERPLRPSAARKPHMLSGKKAAAA 
AAAAAAAATGTEAGPGTAGGSENGSEVAAQPAGLSGPAEVGPGA 
VGERTPRKKEPPRASPPGGLAEPPGSAGPQAGPTWPGSATPME 
TGIAETPEG\RRTSRRICRAKVEyRBMDBSLANLSEDEYYSEBER 
NAKAEKEKKLPPPPPQAPPEEEWESEPEEPSGVEGAAFQSRLPH 
DRMTSQEAACFPDIISGPQQTQKVFLFIRHRTU3LWLDMPKIQL 
TFEATLCXJI^EAPYNSDTVLVHRVHSYLERHGLINFGIYKRIKPL 
PTKKTGKVI I XGSGVSGLAAARQLQSFGMDVn^LEARBR VGGRV 
ATPRKGNYVADLGAMWTGLGGNPMAWSKQVNMELAKIKQKCP 
LYEAJTGQAVPKEKDEMVEQEFNRLLEATSYLSHQLDPNVLNNKP 
VSLGQALEWIQLQEKHVKDEQIEKWKKIVKTQEEtiKELLNKMV 
NLKEKIKELHQQYKEASEVKPPRDITAEFLVKSKHRDLTALCKE 
YDEI*AETQGiaEEKI>QELEANPPSDVYt.SSRDRQILDWHFANLE 
FANATPLSTLSLKHWDQDDDFEFTGSHLTVRNGYSCVPVAIxJ^G 
LDIKJLNTAVRQVRYTASGCEVIAVWTRSTSQTPIYKCDAVLCTL 
PLGVLKQQPPAVQFVPPLPEW1CTSAVQRMGFGNI.NICVVLCFDRV 
FWDPSVNLFGHVGSTTASRGELFLPWNLYKAPILrALVAGEAAG 
IMENXSDDVIVGRCLAILKGIPGSSAVPQPKETWSRWRADPWA 
RGSYSyVAAQSSGNDYDlMAQPXTPGPSIPGAPQPIPRLFFAGE 
HTI RNYPATVHGALLSGIiREAGRIADQFLGAMYTLPRQATPGVP 
AOQSPSM 


7014 


1 


2661 


RRAGSVKRGKARLFGPTERQSERPLRPSAARRPEML56KKAAAA:'" 

AAAAAAAATGTEAGPGTAGGSENGSBVAAQPAG1,SGPAEVGPGA 

VGERTPRKKEPPRASPPGGLAEPPGSAGPQAGPTWPGSATPME^ 

TGIAETPEG\RRTSRRKRAKVEYREMDESliANLSEDEyySEEER 

NAKAEKBKKiiPPPPPQAPPEEENESEPEEPSGVEGAAFQSRLPH 

DRMTSQEAACPPDirSGPQQTQKVFLFIRNRTLQLWLDNPKrQL 

TPEATLQQIjEAPYNSDTVLVHRVHSYbERHGLINFGIYKRIKPL 

PTKKTGKVI I IGSGVSGLAAARQLQS FGMDVTLLEARDRVGGRV 

ATFRKGNYVADLGAMWTGLGGNPMAWSKQVNMELAKIKQKCP 

LYEANGQAVPKEKDEMVEQEFNRLLEATSYLSHQLDFNVLNIXKP 

VSLGQALEWlQLQEKHVKDEQXEHWKKIVKTQEBLKEIiIiNKMV 

NLKEKIKELHQQYKEASEVKPPRDITAEFLVKSKHRDLTAICKE 

YDELAETQGKLEEKLQELEANPPSDVYLSSRDRQILDWHFANIiE 

FANATPLSTI.SLKFIWr)QDDDFEFTGSHLTVRNGYSCVPVALAEG 

r^DlKLNTAWOVRYTASGCEVIAVrmiSTSQTPIYKCnAVLCTL 

PLGVLKQQPPAVQFVPPLPEWKTSAVQRMGFGNLNKWLCPDRV 

FWDPSVNLFGHVGSTTASRGELFLFWNl^YKAPILLALVAGEAAG 

IMENISDDVIVGRCLAILKGIFGSSAVPQPKETWSRWRADPWA 

RGS YS YVAAGSSGNDYDLMAQPITPGPS XPGAPQPI PRX.FFAGE 

HTIRNYPATVHGALLSGLREAGRIADQFLGAMYTLPRQATPGVP 
AQQSPSM 




3 


3950


UbfcVGDKXRlL.ATI.EDGWLEGSLKGRTGlFPYRPVKLCPDTRVE 
ETMALPQEGSLARIPETSLDCLENTLGVEEQRHETSDHEAEEPD 
CIISEAPTSPLGHLTSEYDTDRNSYQDEDTAGGPPRSPGVKWEM 
PLATDSPrsDPTEWNGlSSOPQVPFHPNLQKSQYYS'rVGGSHP 
HSEQypDLLPLEARTRDYASLPPKRMYSQLKTFLQKPVLPLYRGS 
SVSASRWKPRQSSPQLHNLASYTKKHHTSSVYSISERIiEMKPG 
PQAQGIiVMEAATHSQGDGSTDLDSKLTQQLIEFEKSIiAGPGTEP 
DKILRHFSIMDFNSBKDIVRGSSKLITEQELPERRKALRPPPPR 
PCTPVSTSPHLLVDQNLKPAPPLWRPSRPAPLPPSAQQRTKAV 
5PKLLSRHRPTCETLEKEGPGHMGRSLDQTSPCPLVLVRIEEME 
^iDLDMYSRAQEELNLMLEEKQDBSSRTVETLEDLKFCESNIESLIT^








MELQQLREMTLLSSQSSSLVAPSGSVSAENPEQRMIjEKRAKVIE 
EIiLQTERDYIRDLEMCIERIMVPMQQAQVPNIDFEGLFGNMQKV 
liCVSKQLLAALEISDAVGPVFLGHRDELEGTYKiyCQNHDEAlA 
LLEIYEKDEKIQKHLQDSLADLKSLYNEWGCTNYINLGSFLIKP 
VQRVMRYPLLLMELLNSTPESHPDKVPLTNAVLAVKEtNVNINE 
YKRRKDLVLKYRKGDEDSLMEKISKLNIHSIIKKSNRVSSHbKH 
LTGPAPQIKDEVFEETEKNFRMQERIilKSPIRDLSLYLQHIRES 
ACVKWAAVSMWDVCMERGHRDLEQFERVHRYISDQLFTNFKER 
TERLVISPLNQLLSMFTGPHKLVQKRFDKLLDFYNCTERAEKLK 
DKKTLEELQSARNNYEALNAQLLDELPKFHQYAQGLFTNCVHGY 
AEAHCDFVHQAIiEQLKPLLSLLKVAGREGNLIAIFHEEHSRVLQ 
QLQVFTFFPESLPATKKPFERKTIDRQSARKPLLGLPSYMLQSE 
ELRASLIiARYPPEKLFQAERNFNAAQDI*DVSLLEGDt,VGVIKKK 
DPMGSQNRWLIDNGVTKGFVYSSFLKPYNPRRSHSDASVGSHSS 
TESEHGSSSPRFPRQNSGSTLTFNPN\S\MAVSFTSGSCOKQPQ 
DASPPPKEWDQQTI^ASLNPSNSESSPSRCPSDPDSTSQPRSGD 
SADVARDVKQPTATPRSYRNFRHPEIVGYSVPGRNGQSQDXiVKG 
CARTAQAPEDRSTEPDGSEAEGNQVYFAVYTFKARNPNEI4SVSA 
KQPCLKII.EFKDVTGNTEWHIiABVNGKKGYVPSNYIRKTEYT 


7015 


1842 


513 


RQAWHEXVAAPSWRGARLVQSVLRVWQVGPHVARERVIPFSSLL 
GFQRRCVSCVAGSAFSGPRLASASRSNGQGSALDHFLGFSQPDS 
SVTPCVPAVSMNRDEQDVLLVHHPDMPENSRVLRWIjLGAPNAG 
KSTLSNQLLGRKVFPVSRKVHTTRCQAliGVITEKETQVlLLDTP 
GI IS PGKQKRHHliELSIj:.EDPWKSMESADLWVl4VDVSDKWTRN 
QI*SPQLLRCLTKTSQIPSVLVM^7KVDCLKQKSVI,tiEr»TAALTEG 
WNGKKIjKMRQAFHSHPGTHCPSPAVKDPNTQSVGMPQRIGWPH 
FKEI FMLSALSQEDVKTLKQYLLTQAQPGPWE YHSAVLTSQTP^ 
EI CANI IREKLI.EHX.PQEVP YNVQQKTAVWEEG PGGBhVXQQKIj 
LVPKES YVKLLIG PKGHVISQIAQEAGHDI4MDI FIX:DVDIRIiSV 
KLLK 


7016 


167 


2513 


ILNAP KPPPPRDS VEAVAAKRDTGGGS WGTGMDVSGQETDWRST 
AFRQKXiVSQIEDAMRKAGVAHSKSSKDMESHVFliKAKTRDEYIiS 
LVARLI IHFRDIHNKKSQTVSVSDPMNALQSLTGGPAAGAAGIGM 
PPRGPGQSLGGMGSLGAMGOPMSLSGQPPPGTSGMAPHSMAWS 
TATPQTQLQLQQVAAAAAAATARSSSSSSRRRYSSSSSSSNSKQ 
FOAQQSAMQQ\QFQA\WQQgQQL\QQQOQQQQKI.XKX,HHQKQQ 
QXQOQQQQLQRIAOLQLQQQQQQQQQQQQQQQQALQAQPPIQQP 
PMQQPQPPPSQALPQQLQQMHHTQHHQPPPQPQQPPVAQNQPSQ 
LPPQSQTQPLVSQAQAIiPGQMLYTQPPLKFVRAPMWQQPPVQP 
QVQQQQTAVQTAQAAQMVAFGVQVSQSSLPMLSSPSPGQQVQTP 
QSMPPPPQPSPQPGQPSSQPNSNVSSGPAPSPSSFliPSPSPQPF 
\QSPVTARTPQNFSVPSPGPLNTPVNPSSVMSPAGSSQAEBQQY 
liDKLKQLSKYIEPLRRMINKIDKNEDRKKDLSKMKSl»I>DII»TDP 
S KRCPLKTLQKCE lALEKLKNDMAVPTPPPPPVPPTKQQYIiCQP 
LLDAVLANIRSPVFNHSDYRTPVPAMTAIHGPPITAPWCTRKR 
RliEDDERQSIPSVLQGEVARLDPKFLVNLDPSHCSNNGTVHLIC 
KLDDKDLPSVPPIiELSVPADYPAQSPLWIDRQWQYDANPPLQSV 
HRCMTSRLLQLPDKHSVTALLNTWAQSVHOACLSAA 


7017 


1 


1785 


INI^NTCYMNSVI*ALFMATDFRRQVLSLNLNGCNSLMKKLQHl, 
FAFIAHTQREAYAPRIFPEASRppWFTPRSQQtXirSEYIjRFLIiDR 
liHEEEKILKVOASHKPSEILECSETSLQEVASKAAVIiTETPRTS 
DGEKTIilEKMPGGKLRTHtRCLNCRSTSQKAEAFTDLSIiAFWPS 
YSLEYMSCPDCSQS PS IQDGGIiMQASVPGPS EEPWYNPTTAAF 
ICDSLVNEKTIGSPPNEFYCSENTSVPNESNKIJbVWKDVPQKPG 
GETTPSVTDIiLNYFLAPEILTGDNQYYCENCASLQNAJaKTMQIT 
EEPEYLI l^TIiLR FSY DQKYHVRRKrLDNVSLPLVLELP VKRITS 











FSSLSESWSVDVDFTDLSENLAKKLKPSGTDEASCTIOiVPYia^S' 
SVWHSGISSESGHYYSYARNITSTDSSYQMYHQSEAIALASSQ 
SHLLGRDS PSAVFEQDLENKEMS KEWFLFNDSRVTFTS FQS VQK 
ITSRFPKDTAYVLLYKKQHSTNGLSGmiPTSGLMlNGDPPLQKE 
liMDAITKDNKliYLQEQELNARARALQAASASCSFRPNGFDDNDP 
PGSCGPTGGOGGGGFNTVGRIiVP 


7018 


484 


1066 


SLVFRGNTWSGEAGHHCSALFNLAAYHQLFVGTERIRAPEI I FQ 
PSLIGEEQAGIAETLQYILDRYPKDVQEMLVQNVFLTGGNTMYP 
GMKARMEKELIiEMRPFRSSFQVQI*ASNPVLDAWYGARDMAIiNHL 
DDNEVWITRKEYEEKGGEYI.KEHCASNIYVPIRLPKQASRSSDA 
QASSKGSAAGGGGAGEQA 


7019 


1048 


335


APGGFLVTMVFPAPS P PWt^CCSHE VTAGPPTLCKDMS AL\?AA~" 
RMRHIPIAPGSDWRDLPHlEVRIiSDGTMT^KIiRYTHHDRKNGRS 
SSGAliRG VCS CVEAGKACDPAARQFNTLI PWCLPHTGNRHNHWA 
GLYGRLEWDGFFSTTVTNPEPMGKQGRVLHPEQHRWSVRECAR 
SQGFPDTYRLFGNILDKHROVGNAVPPPlxAltATftT.TJ'Ttt'T .myrr 
ARESASAKIKEEEAAKD 


7020 


1 


2154 


FADS KRKS VLLDKl KNLQVALTSKQQSLETAMSFVARNTFKRVR 
KG Flrf^RIWAVFFSNT PTRAS PQLREAVLKLSDAG ITPLFliTRQE 
DRQLINALQIWNTAVGHALVI,PAGRDLTDFLENVXiTCHVCL0lC 
NIDPSCGFGSWRPSFRDRRAAGSDVDIDMAFILDSAETTTIiFQF 
NEMKKYIAYLVRQLDMSPDPKASQHFARVAWQHAPSESVDKAS 
MPPVKVE FSLTD YGS KEKIjVDFLSRGMTQLQGTRAI»GSAIEYTI 
ENVFESAPNPRDIiKIWLMLTGEVPEQQLEEAQRVILQAKCKGY 

ffvvxxsigrkvwikevytfasepnijvffklvdkstei^eeplmr 
fgrllpsfvssenafylspdirkqcdwfqgdqptkniivkfghi^ 

VNVPNNVTSS P VTTTKP VTTTKPVTTTTKPVTTTTKPVTI 
INQPSVKPAAJUCPAPAKPVAAKPVATKTATVRPPVAVKPATAAK 
PVTVAKPAAVRPPAAAAAKPVATKPEVPRPQAAKPAATKPATTKP 
MVKMSREVQVFEITENSAKIoHWERPEPPGPYPYDLTVTSAHDQS 
LVLKQNLTVTDRVIGGLLAGQTYHVAVVCYIjRSQVRATYHGS PS 

tkksqppppqparsassstinlmvsteplaltetdicki*pkdeg 
tgrdfilkvrifydpntkscarfwyggcggnbkkfgsqkeceicvca 

PVliAKPGVI S VMGT 


7021 


2 


338


VNAVS FFPNG YAFATGSDDATCRIiFDIiRADQELLIiYSHDNI ICG 
ITSVAFSKSGRLXjLAGYDDFNCNVWDTIjKGDRAGVLAGHDNRVS 
CLGVTDDGMAVATGS WDS FIiRI WN 


7022 


2 


856


vyigsfwshpllipdnrklfeaeeqdlfrdiqs'lprnaalrkln 

DL I KRARIiAKVHA YI I SS LKiCEMPS VFGKDNKKKEIiVNNIiAE I Y 
GRIEREHQISPGDFPNLKRMQDQLQAQDFSKPOPIiKSKLLEWD 
DMIAHDIAQLMVLVRQEESQRPIQMVKGGAFEGTLHGPFGHGYG 
EGAGEGIDDAEWVVARDKPMYDEI FYTLSPVDGKITGANAKKEM 
WSKLPNSViySKIWKI^IDKDGMLDDDEFAIiANHLIKVlCIjEGH 
ELPNELPAHIiIiPPSKRKVAE 


7023 


2 


748 


AMVFGGWPYVPQYRDIRRTQNADGFSTYVCLVLIiVANIliRILF 
WFGRRFESPLLWQSAIMILTMLLMLKLCTEVRVANELNARRRSF 
TAADSKDEEVKVAPRRSFLDFDPHHFWQWSSFSDYVQCVLAPTG 
VAGYlTYI^IDSAXjPVETLGFIAVLTEAMLGVPQIiYRNHRHQST 
EGMS I KMVLMWTSGDAFKTAYFLLKGAPIiQFSVCGLiLQVLVDIiA 
XLGQAYAFARHPQKPAPHAVHPTGTKAI* 


7024 


1207 


150 


RTGVTGWAQVWMFGGGGVIiSSGEQIiQMPVKPERGLGPSDGWIiV 
SSRRGSPGrVLGLPFWLLTPVLVSRSIRSMLLLTRSPTAWHRLS 
QLKPPVLPGTLGGQALHLRSWUjSRQGPAETGGQGQPQGPGLRT 
RUiITaiiFGAGLGGAWIALRAEKERIiQQQKRTEAIiRQAAVGQGD 
FHLLDHRGRARCKADFRGOWVLMYFGFTHCPDICPPELEKLVQV 
VRQLEAEPGLPPVQPVFITVDPERTDVEAMARYVQDFHPRLLGL








TGSTKQVAQASHSYRVYYNAGPKDEDQDYIVDHSIAIYLLNPDG" 
XjFTDY YGRSRS AEQI SDSVRRHMAAFRSVIiS 


7025


232 


832 


ERNS P IGNWENL *K\HSLDCLCFRGDWEGNTUFQTLQDNQEECF 
KQVIRTCEKRPTFNQHTVFNLHQRLNTGDKLNEFKELGKAFISG 
SDHTQHQL I HTS EKFCGDKECGNTFLPDSE VIQYQTVHTVKKT Y 
ECKECGKSFSLRSSLTGHKRIHTGEKPFKCKDCX3KAFRPHSQLS 

vhkrihtgeksyeckecgkafscg 


7026 


328 


1146 


NPNPS IGDlKDIKKAAKSMLDPAHKSHFHPVl'PSIiVFLCFI FDG 
LHQAbLSVGV5KRSNTWGNENEERGTPYASRFKDMPNFIAI>EK 
S5 VU^CCDLLI GVAAGSSDKICTSSLQVQRRFKAMMAS IGRtiS 
HGESADLLISCNAESAlGWISSRPWVGELMFTFLFGDFESPIiHK 
LRKSS*LPRKHR*QPINAVRMFr.DOCKDGSlALRAIVSEIPVFE 
EKKimG*KGIGEIF*WGCrrLPPHYWGAVa:TNVPKIiSNSGKLI/^ 
QDEQPHIFG 


7027 


43 


954


GRRLQQQQRPEDAEDGAEGGGKRGEAGWEGGYPEXVKENKLFEri 
YYQELKIVPEGEWGQFMDAIiREPLPATIiRITGYKSHTVKEILHCL 
KKJCYFKEIiEDLEMDGQKVEVPQPLSWYPEELAWimJLSRJCILKK 
SPHLEKFHQFLVSETESGNISRQEAVSMIPPLLLNVRPHHKILD 
MCAAPGSKTTQLIEMLHADMNVPFPEGFVIANDVDNKRCYLIiVH 
QAKRLSSPCIMWNHDASSIPRLQIDVDGRKEIIjFYDRILCDVP 
CSGDGT^4RKNIDVWKKWTTLNSLQI,HGrJQLRIATRGAEQL 


7028 


189 


608 


SRPPPEPEPGTMVEKGSDSSSEKGGVPGTPSTQSIjGSRN b"! RNS 
KKMQSWYSMLSPTYKQRNEDFRKIiFSKLPEAERLlVDYSCALQR 
ElLLQGRLYLSENWICFYSKriFRWETTISIQIiKEVTCLKKEKTA 
KlilPNAIQ 


7029 


1343 


40 


VLESNTEAKQATGTSSKLRHGTGQE KGREGPRCPSGlAQLRLWG 
/PCPHAGRETGPRASAPIPGS*GHGWHW*RKDGRGERSEGPSAIj'^' 

sphspsliinmqqapthvgpgmgsqrprsswpeqvgvgsqiisre 
rwra*rslpoaaasertemtkersp/rpcqgydssnwftqpgkk 

TRKRNS RRNTMVSRGGGCIiLYPLQS IMPE* QLR*GAHASPPTQG 

r*gkggprs pltkasgtthi ptpffgs ip/rptrdsgpgtdws\ 
aapgqkrghrea*qgpepv/wgrvtthlqgpag* tkplgs \rnw 
vpgpaegeqgegaglegrp*plkgcrstltfspqlsipmvgickp 

PEGTTASFFP\RSCHSE*RKPPPSCPHAPAIiSLPHPLPI,PLPPL 

plplpgagt*hsarsgrpgqsbtgslchnchhcpphcpkcspgg 

T 


7030 


2 


521 


FVCFSAPGSGQGGKRRVNMELSAVGERVFAAETOJbKRRIRKGRM 

KRGPiCPKTFLLKAOAKAKAKTYEFRSDSARGIRI P YPGRSPQDL 
ASTSRAREGIiRN\RVCPRQRAAPAPAAP\PRRGPSGPGPRPG*G 
PGLHFPGPGGPSKHGFVPASEOHQHQQHLPRRGPSGPGPRPG 


7031 


560


59 


HCSVPGAEWPRKPPAQICPQLTSRPHLSSPRSLSPGCGHSPGPG 
/CKPS/RHCDELHEGPSRTAALPCX3KPQPKHGVEECG/PCPCIA 
PRRIiTEPPALTVSPVGRAAPSGAL*PSGRACSACSHRLAPEAAL 
SAAAPRPSLGSGQNASGLPAASLPPQDSSQPHKTVPSPARSVPP 
LGAQARAAPPRLWCPRALVSG* EASPEAVSVAAGPPVPGPTPST 
SGSTAS HS RRGC* S PR*TPAPPRRDHGRS AAFEVLTAAASAQPC 
ASQGGPRPTGAGRTPSPLGLPPSRGPPAASARPFCRHPSL 


7032 


1393 


2104 


RRPGRTEPVEPPPVPPPPRASNSKSRCR*RKIiHliAPl**QSPLRK 
SRQIGTSSLPFGRSAGERPRPAATFCLSRGGSSPVFL*PSSSSL 
EPWMKRQFGRLHSLPWKSWQKMNSPLIiTPKLDTSLMSGWRYRQR 
LPRLHTFLKKSLQMASELAPPLPTPAPlASSIiPPPPGPPPLLPV 
PtiA*LSRSGILVPPNSGFSLSC\PLGDH*GSSGEVRGSCGSPPP 
HHCWVLPPPP*LLLPPR 


7033 


685


815 


RSRDGLSSSATSNRARRSKCSGPKRATPLDSGPGP*APPGPSSA 











LMMPSSCPWRTGALGPSPAGSRAIiGRCTSSVGPGSRWLTRTSSP 
GCATRTWRTMRMEPRPLRSRMGESAPGIPAELPSAAPSGPSAPS 
AAAPSAPTTPAAAGPNTL*SRRTAEWCMPPSCSCCWGWC*SWSA 
WDWRRPPLQVSPAPSSSCRASCCWCLESIT*SSSTARSRATGAS 
SSSTCPTSRSDRGAAWTPVSPMGAPIaLPCSVPLISREEALQDPR 
NPS P *GVCSGS SGHAG1AU5KPP VACS VP 


7034 


92 


1942 


EDTSSMPFRLl*! PU3I)I,aVLrjPQHHGAPGPDGSAPDPAHYRERV 
KAMFYHAYDSYLENAFPFDELRPLTCDGHDTfrJGSFSLTLIDALD 
TLL\TLFYFQI LGNVSE FQRWEVLQDSVDPD 1 DVNASVFETNI 
RWGGLLSAHIiLSKKAGVEVEAGWPCSGPI,LRMAEEAARKIiL*PA 
FQTPTGMPYGTVNLLHGVHPGETPVTCTAGIGTFIVEFATLSSL 
TGDPVFEDVARVALMRLWESRSDIGLVGNHIDVLTOKWVAQDAG 
IGAGVDSYFEYLVKGAIIiLQDKKLMAMFLEYNKAIRNYTRFDDW 
YliWVQMYKGTVSMPVFQSLEAYWPGLQSIjIGDIDNAMRTFLNYY 
TVWKQFGGLPEFYNI PQGYTVEKREG YPLRPELI BSAMYLYRAT 
GDPTLLELGRDAVESIEKISKVECGFATIKDLRDHKLDNRMESP 
FLAETVfCYLYLLPDPTNFIHNWGSTFDAVITPYGECIXiGAGGyi 
FNTEAHPIDPAALHCCQRLKEEQWEVEDLMREFYSLKRSRSKFQ 
KNTVS SGP WEP PARPGTLPS PENHDQARERKPAKQKVPIiLS CPS 

QPFTS KIALLGQVFIiDSS * PLDNFFI FI FIiRLNYKKLLr*AI I KK 
K 


7035 


92 


1942 


edtssmpfriiliplgllcallpqhhgapgpdgsapdpahyrerv 
kamfyhaydsylenafpfbelrpltcdghdtwgsfsltlidau:) 
tllXtlpypqilgkvseforvvevlqdsvdfdidvnasvfetni 
rwgglls7vhllskkagvevbagwpcsgpllrmaeeaarki.iipa 

FQTPTGMPYGTVNIiHGVNPGETPVTCTAGIGTFIVEFATLSSl. 

TGDPVFEDVARVAIiMRLWESRSDlGIiVGNHIDVLTGKWVAQDAG"-' 

laAGVDSYFEYLVKGAILLQDKKLMAMFLEYNKAIRNYTRFDDW. 

YLWVOMYKGTVSMPVFQSLEAYWPGLQSLrGDIDNAMRTFLNYY 

TVWKQFGGLPEFYNIPQGYTVEKREGYPLRPELIESAMYLYRAT 

GDPTIiLELGRDAVES lEKIS KVECGPATIICDIiRDHKLDNRMESF 

FtiAETVKYLYLLFDPTNPIHNKGSTFDAVrrPYGECILGAGGyi 

FNTEAHPIDPAALHCCQRLKEEQWEVEDLMREPYSLKRSRSKFQ 

KNTVSSGPKEPPARPGTLPSPENHDQARERKPAKQKVPLLSCPS 

QPFTSKIAIJLGQVFIiDSS*PLDNFFlFIFXiRUryWiajIj^ 

K 


7036 


442 


761 


CLiAPIiFSCFQIINLHIiAPSGRt»RWAKLiRGPQRN*I>PGEGPSIPT 
RNW*ERKAGCSQPC/ PAQQHHGRPPGVS PLiPRDPHPTTIiRPLPP 
PPPPPPPPPRRPPRNRRPG 


7037 


442 


761 


CU^LFS CFQI INL,HLAPSGR1iRWAWIjRGPGRN*LPGEGPS I PT 
RNW* ERKAGCSQPC/PAQQHHGiiPPGVS PliPRDPHPTTLRPItP P 
PPPPPPPPPRRPPRNRRPG 


7038 


155


891 


GAGAASDMSSGLRAADFPRWKRHISEQIiRRRBRIiOROAFEEI 11, 
QYNKLLEKSDLHSVIiAQKXQAElCHDVPNRHEISPGHDGTMNDNQ 
LQEMAQLRI KHQEEIiTELHKKRGELAQ \RVIDLNNQMQRKDREM 
QMNEAKIAECLQTISDLETECIiDLRTKLCDLERANQTLKDBYDA 
I^ITFTALEGKLRKTTEBNQEZ.VTRWMAEKAQEAIfRLNARE*KR 
LQEAASPAAERACRSSKGTSTSRTG 


7039 


155 


891 


GAGAASDMSSGLRAABFPRWKRHISEQIiRRRDRliQRQAFEEIII* 
QYNKLLEKSDIHSVLAQKLQAEKHDVPNRHEISPGHDGTWNDNQ 
LQEMAQLRIKHQEELTELHKKRGEIAQXRVIDLNNQMQRKDREM 
QMNEAKIAECLQTISDLETECLDLRTKLCDLERANQTLKDEYDA 
LQITFTALEGKLRKTTEENQELVTRWMAEKAQEANRLNARE*KR 
LQEAASPAAERACRSSKGTSTSRTG 


7040 


34 


789 


KITPPRRPHRCSSGHGSDNSSVLSGEIjPPAMGKTALFYHSGGSS 
GYES VMRDS EATGSASSAQDSTS ENSSS VGGRCRSXjKTPKKRSK 






7041 






567 




^GSQRRRl,IPALSl.DTSSPVKKPPNSTGVHHVDGPbRSSPKGi;G 
EPFEIKVYEIODVERLQRRRGGASKEAMCFNAKLKILEHRQORI 
AEVRAKYEWLMKELEATKQYLMLDPNKWLSEFDLEQVWELDSLE 
YLEALECVTERLESRyNFCKAHLMMIT CFDlT 

;>^KVAMCjRRRAPAGGS UjKALt4RHQTQRSRSHRHTDSWLHTSEL 

NDGYDWGRLNLQSVTEQSSLDDFLATAELAGTEFVAEKLNIKFV 

PAEARTGLLSFEESQRIKKI^EENKQFLCZPRRPNWNQNTTPEE 

LKQAEKDKFLEWRRQL\VRLEEEQKL1LTPFERNLDFWRQLWRV 
lERSD IWQIVDA 

PIHMAWbKAUl \ISPLFPHXQGYLLLSASHG \ATSLHTKGAI. 
PLETVTMYTVIPKSKYVLVKPDTQyPYSENLDEFKRLAENSASN 
DDLLMAEVArSDYGDKLTLELRBKY 



7044 



276 



734 



7045 



513 



7046 



513 



7047 



103 



466 



ARGMAARDSuaEEDX,VSYGTGL£?LEE GKRPKKPiPLODCyrVRD 
EKGRYKRFHGAFSGGFSAGYFNTVGSKBGWTPSTFVSSRQNRAD 
KSVLGPEDFMDEEDLSEFGIAPKAIVTTDDFASKTKDRIREKAR 
OtAAATAPIPGATI,I>DDLITPAKLSVGFELLRKMGWKEGQC?VGP 
RVKRRPRRQKPDPGVKIYGCALPPGSSEGSEGEDDDYLPDNVTF 
APKDVTPVDFTPKDNVHGLAYKGLDPHQALFGTSGEHFNLFSGG 
SERAGDLGEIGLNKGRKLGISGQAFGVGALEEEDDDIYATETLS 
KYDTVLKDEEPGDGLYGWTAPRQYIOsrQKESEKDIiRyVGKILDGF 
SUVSKPLSSKKIYPPPELPRDYRPVHYFRPMVAATSENSHLLQV 
liSESAGKATPDPGTHSKHQLNASKRAELLGETPIQGSATSVLEF 
LSOKDKERIKEMKQATDLKAAQLKARSLAQNAQSSRAQPSPAAA 
AGHCSWNMALGGGTATLKASNPKPFAKDPEKQKRYDEFLVHMKQ 
GQKDALERCLDPSMTEWERGRERDEFARAALLYASSHSTLSSRF 
THAKEEDIfSDQVEVPRDQENDVGDKQSAVKMKMFGKLTRDTFEW 
HPDKLLFO/RLVGLPRVKRDKYSVFNFLTLPETASLPTTOASSE^- 

KVSQHRGPDKSRKPSRWDTSKHEKKEDSISEFI.RrARSKAEPPK 
QQSSPLVNKEEEHAPELSAN 



E\nri.TDEFAKGRKVAJ3LYEI.VQYAGWI IPRLYX.LITVGVVYVKS ' 
FPQSRKDILKDLVEMCRGVQHPLRGLFLRNYIJWrrRNXLPDEG 
EPTDEETTGDISDSMDFVLLNFAEMNKLWVRMQHQGHSRDREKR 
ERERQELRILVGimVRLgQV 



LGFKMEALSRAGQEMSIiAALKQHDPYI TSIADLTGQVALYTFCP 
KANQWEKTDIEGTLPVYRRSASPYHGFTXVKRLNMHNbVEPVNK 
DLEFQLHEPFLLYRNASLSIYSXWFYDKNDCHRIAKLMADWEE 
ETRRSQQA/RSGQTESQPGQWLQRPQAHRHPGDAEQSQG 



U>FKf4EALSKAGQEMSIAALKQHDPriT SIADLTGQVALYYFCP 
KANQWEKTDIEGTLFVYRRSASPYHGFTIVNRLNMHNLVEPVNK 
DLEPQLHEPFX^LYRNASLSIYSIWFYDKNDCHRIAKLMADWEE 
ETRRSQQA/RSGQTES0PGQWLQRPQ7\HRH PGDASQSQG 
QMKX EKCGWScc^IjTS 1 KGNCHNFYTAISK DVTYKELKNLLNSKN 
IMLIDVREIWElLEYQKlPESINVPIiDEVGEALQMNPRDFKEKY 
NEVKPSKS DS/lVFSYIAGVRSKKALDTATgrriPWQWPP 



7049 



7050



393 



938 



538 



FFCLTLLSSWUYRHHATRRV ISSPVFTMEDSGKTFSSEEEEANY " 
WKDLAMTYKQRAENTQEELREFQEGSREYEAELETQLQQIETRN 
RDI*LSENNRLRMELETIKEKFEVQHSEGYRQISALEDDIAQTKA 

ikdqlokyireleqanddlerakratdhglsktfeXqrlnXqai 



EKKW 



KHTGSASYGGt't^i^lii^GPA TXASVAGRCSSVGKIPARRCYEDEL 

vpvfeavgriyelrlmmdfdgknrgyafvmychkheakravhei. 

NNYEIRPGRLLGVCCSVDNCRLPIGGIPKMKKREEILEEIAKVT 

EGVLDVIVYASAADKMKNRGLRLRGVREPPRGCHWLGRKIiIAWX 
ASSLWG 

KRTGSASYGGPPPGLGGPATXASVAGRCSSVGKIPARRCYEDEL 





7051 






vi-Vb-fc;AVGKXY£LRLMMDBl)OKNRGYAFVMYCHKH£AJCRAVRgir" 

NITYEIRPGRLLGVCCSVDNCRLFIGGIPKMKKREEILEEIAKVT 

EGVLDVIVYASAADKMKNRGLRLRGVREPPRGCatWLGRKLlAWX 
ASSLWG 


7052 


119 


816 


i^fu^HLM^U=.X CUWAKK^Rh )£hLUati XUSSMVYYQGVMQQIQRHCQS 
VRDPAXKGKWQQVRQELLEEYEQVKSIVGTLESFKIDKPPDfPV 
SCQDEPFRDPAVWPPPVPAEHRAPPQIRR/RQSRSKTSEERNGR 
SRSPGTCRPSrXPISKSEKPSTSRDKDYRARGRDDKGRKNMQDG 
ASDGEMPKFIXSAGYDKDLVEALERDIVSRNPSIHWDDIADLEEA 


7053


467 


715


bCiPGRGKMSKlaLNPEEMTSRDYYFDSYAHFGlHEEMLKDEVRTL 
TYRNSMYHNKHVFKDKVVLDVCSGTGILSMFAAROGPRR 


7054 


467 


715 


SCPQRQKMSKLLNPEEMTSRDYYFDSYAHFGIHEEMLKDEVRTir" 
TYRNSMYHNKHVFKDKVVLDVGSGTGILSMPAARQGPRR 


7055 


1 


1036 


GTSQRSRETUARRRSAGAKPTARLPWPAAIifeEWPSCPCEPliGPG 
RRCRWDAMEYDEKLARFRQAHLMPFNKQSGPRQHEQQPGEEVPD 
VTPEEALPELPPGEPEFRCPERVMDLGLSEDHFSRPVQLFIA5D 
VQQLRQAIEECKQVILELPEQSEKQKDAWRLIHLRLKLQELKD 
PNEDEPWIRVLLEHRFYfCEKSKSVKQTCDKCNTIIWGLIQTWYT 
CTGCYYRCHSKCLNLISKPCVSSKVSHQAEYELNICPETGLDSQ 
DYRCAECRAPI/CS/DGWPSEARQCDYTGOYYCSHCHWNDLAV 
IPARWHNWDFEPRKVSRCSMRYIAIiMVSRPVLRLREIK 


7056


2 


527 


Ui^KKVfaWRSWI^t/wuiUlJjCXjFIWLSMNVLLPMICTFIiLYNQGP " 
EYHYLHQMLG/ALCLSRASASVLNLNCSLIIJLiPMCRTLrAYLRG 
SQKVPSRRTRRLLDKSRTPHITCGATICIFSGVHVAAHLVKALN 
FSVNYSEDFVELNAARYRDEDPRKIJ^rrTVPGLTGVC^lEVVIiF;. 


7057 


2 


527 


EYHYLHQMLG/AbCLSRASASVI^LMCSLlLLPMCRTLIAYLRG 
SQKVPSRRTRRLLDKSRTFHXTCXEATI CIPSGVHVAAHr,VWAlxN 

FSVNYSEDFVELNAARYRDEDPRKLLPTTVPGLTaVCMSVVLFL 
M 


7058


1368 


431 


GIYLHVNBKlPRPTCIGDRQENDKENLNLENHRDQELLHASCQA 
SGEVPSQASIiRGFFrEDEPGCFGEGEWI^PEALQNIQDEGTGEQL 
SPQERISEKQLGQHLPNPHSGEMSTMWLEEKRETSQKGQPRAPM 
AQKLPTCRECGKTFYRNSQLI FHORTHTGPTVPnfn'T r'tfvn vi o 

SSDFVKHQRraTGEKPCKCDYCGKGFSDFSGrJUiKEKXHTGEiCP 
YKCPICEKSPIQRSNFNRHQRVHTGEKPYKCSHOGKSFSWSSSL 
^^Q^^^^^^I^FQ^PVTKLSPPISISQPSHKNTQLHOEEIiCIiR - 


7059


1 


469 


fSGFGAVPDAliGCRMSDLRITEAFLYMDYLCFRALCCKGPPPSR" 
PEYDLVCIGriTGSGKTSLLSKLCSESPDNWSTTGFSIKAVPPQ 
NAILNVKELGGADMIRKYWSRYYQGSQGVIPVLDSASSEDDLEA 
ARN*SCTQLLQHPQLCTLPPLIUV 


7060 


1 


1178 


WPAFPRQPAAAAMUAIil/STGPRRARGCLGAAGPTSSGRAARtPA 
APWARPSAWLECVCWrFDr,EI,GQAI.ELVYPNDFRXiTDKEKSSI 
CYLSFPDSHSGCLGDTOFSFRMRQCGGQRSPWHADDRHYWSRAP 
VALQREPAHYFGYVYFRQVKDSSVKRGYFQKSLVLVSRLPFVRL 
FQALbSLIAPEYFDKLAPCLBAVCSEIDQttPAPAPGQTLNLPVM 
GWVOVRIPSRVDKSESSPPKQFDQENLLPAPWIASVHELDLF 
RCPRPVLTHMQTLWELMLLGEPLLVLAPSPDVSSEMVLALTSCL 
2PLRFCCDPRPYFTIHDSEFKEFTTRTQAPPNVVLGVTNPFFIK 
rLQHWPHILRVGEPKMSGDLPKQVKLKKPPK:v*RPWDTKP 




30 


1670 


SVNLPPSLWPWEEAMDSTKSEPLKGSPEAEDGNrEYKKLVNPSQ 
SfRFEHLVTQMKWRLQEGRGEAVYQiGVEDNGLLVGLAEEEMRAS 











i.itri.HKMAEKVGADIT\njREREVDyDSDMPRKITEVLVRK\^DN" 

OQFLDLRVAVLGNVDSGKSTLLGVLTQGELDNGRGRARLNLFRH 

LHEIQSGRTSSISFEILGFNSKGEVHGINGTQWGQTLRMGW*** 

RT*DGGRVWRLFE1V*MNALRGL*TSSAPLRKSMGNQI.N*IKNG 

VKIKRQGHPGNGLGPGMSEGVGRAGRRH*GPWALGQVVNYSDSR 

TAEEICESSSKMITFIDLAGHHKYLKTTIFGLTSYCPDCALLLV 

SANTGIAGTTREHLGIJUALKVPFFIVVSKIDLCAKTTVERTVR 

QIiERVLKQPGCHKVPMLVTSEDDAVTAAQQFAQS PNVTPI FTLS 

SVSGESLDLI,KVFLNILPPLTNSKEQEELMCK3LTEFQVDEXYTV 

PEVGTWGGTLSR*IDLIATLPTQPSPIYSKTSWPKGGDPGI 


7061 


364 


710 


MMPSPI^PPCLPVMDPEi^LEEPETARLRFRGPCyQEV/WSPRE 
ALARLRELCCQWLQPEAHSKEQMLEMLVIiEQFLGTLPPEIQAWV 
RGQRPGSPEEAAAIiVBQLQHDP*ARMPSPIK3PPCLPVMDPETTL 
EEPETARI,RFRGFCYQEVAGPREALARI>RELCCQWLQPEAHSKE 
QMLSMLVLEQFLGTIiPPEIOAWVRGQRPGSPEEAAALVEGI^JHD 
PGQLLG 


7062 


71 


744 


AKAGTNLERJjHWbSYFFCIPKHKLKSSQKDICVRQFKACTQAGER 
TAIYCLTQNEWRLDEATDSFFQNPDSI.HRESMRNAVDKKKLERI. 
yGRYfa3PCDENKIGVDGXQQFCDDLSLDPASISVLVIAWKFRAA 
TQCEFSRKEFLDGMTELaCDSMEKLKALLPRLEQEl.KDTAKFKD 
FyOPTFTFAKNPGQKGLDL*MAGAYWKriVLSGRFKriiYLWNTFL 



7063 


2 


562 


LRTVPDLPGRRFRAMRTGQRR*PELPPDMNSLEQAEDI*KAFERR 

LTEYIHCI^PATGRWRMLLIWSVCTATGAWNWLIDPETQKVSF 

FTSLWNHPFFTlSCI'TLIGLFFAGIHKRVVAPSriAARCRTVLA 

EYNMSCDDTGKLlLKPRPFrVQ*QSSI,IVMGLKIAFLRlSDTAKS ' 
HKGFLIiRLDM . K 


7064 


300 


864 


RDTGSDPSSTRRr.f-SVCCTGH*PAEPrASPHPSRGTCPPASSAS 
SRRTGCWTCPPESGHAQARRSRRASASRWGARGAVRSAVAARGC 
SSRAGRWLETPGRRRGPPACAAAAGRLRGPAP*AAPPTASVPAR 
(-ttu f AAK I OAfc'AAAl WtiRRRLSGLRAPAuGRRRS PGPS PKSAAP 
PLLTPLGAGRAGGSRANS 


7065 


1 


555 


ATTTHSARRSGRGAAAEAAASAAGGRQKGPDRKAWEGRRTTPGG 
RSQSEPKAPPPOKRSEAAFASMAHSPVAVQVPGMQNNIADPEEL 
FTKLERIGKGSFGEVFKGIDNRTQQWAIKIIDLBEAEDEIEDI 

RAGPFDEFQ 


7066 


356


676 


PGPQRGPWRAREGGHPLDPADHPRAPASLRSXrVRAATMMQIGDT 
YNQKHSLFNAMNRFIGAVNm^DQTVMVPSLLRDVPLADPGLDND 
VGVEVGGSGGCLEERTPP 


7067 


152 


973 


KENITMATEIGSPPRFFHMPRFQHQAPRQLFYKRPDFAQQQAMQ 
QLTFDGKRMRKAVNRKTIDYNPSVIKYLENRIWQRDQRDMRArQ 
PDAGYYWDLVPPIGMLNNPMNAVTTKFVRTSTNKVKCPVFWRW 
TPEGRRLVTGASSGEFTIiWNGLTFNFETILOAHDSPVIiAMTWSH 
NDMWMLTADHGGYVKYWQSNMNNVKMFQAHKEAIREAUFIHNIP 
FSWPIVMVKLFSKCILGAEMHGLCQFLGNFLHPINTIFFFVFT 
HSPFCWAPF 


7068 
7069 


222 


816 


DTMKEYVriLLFIJU:iCSAKPFFSPSHIALKNNm.KDMEDTDDb^^ 
DDDDDDDDDDEDNSLFPTREPRSHPPPFDLFPMCPFGCQCYSRV 
VHCSDLGLTS VPTNIPFDTRMLDLQWNKI KEI KEWDFKGLTSLY 
GLILNNNKLTKIHPKAFLTTKKLRRLYLSHNQLSEIPLNLPKSI* 
AELRIHENKVKKXQKDTPKKK 




1147 



1765 


FRDHRRYFYVNEQSGESQWEFPDGEEEEEESQAQENRDETLAKQ 
rUCDKTGTDSNSTESSETSTGSLCKESFSGQVSSSSIjMPLTPFW 
rLLQSNVPVLQPPIiPIjEMPPPPPPPFESPPPPPPppPAPKMPPP 





7070 






RVSTAQKR IEEWKQQQLVS6MAERNANFBA '=^"^*^^fcED 


7071


1 


547 




1 

7072 


2 


921 


ARGTLKALrt: lAKKVGK vuANGQKAAGPSADSVTENKIGSPPKTP 
VSNVAATSAGPSNVGTELKSVPOKSSPFLTRVPAypPHSENXQY 
FQDPRTQIPFEVPQYPQTGYYPPPPTVPAGVAPCVPRFVRSKNV 
PESSI.PPASMPYADHYSTFSPRDRMNSSPYQPPPPQPYGPVPPV 

' •^*^"*^^PQWAQYHTQKAPLVSSTLPVATQSPTPPSTLMRGEg<; 


7073 


2 


921 


ARU'iLKAiaKTAKKVGKVGANGUKAAGPSADSVTEWKIGSPPKTP ' 
VSNVAATSAGPSNVGTELNSVPQKSSPFLTRVPAYPPHSEWIOY 

i't.ii^ljPPASMPYADHySTFSPRDRMWSSPYQPPPPQPYGPVPPV 
PSGMYAPVYDSRRIWRPPMYQRDDIIRSNSLPPMDVMHSSVYQT 
SLRERYNSLDGYYSVACQPPSEPRTTVPLPREPCJGHMCTSCEEQ 
lRRKPDQWAQYHTQKAPLVSSTLPVATOSPTPPSTr.MP«p.«c 


7074 


50 


S04 


^^at-OVSDFPAPAAAPAHTLTSFSGSLsFuVRKPLGRAPAMP ■ 
LVRYRKWILGYRCVGKTSLAHQFVEGEFSEGYDPTVENTYSKI 
^s^tuc nijtiu vu I AGQDE YS ILP YS FI IGVHGYVLVYSVl'SL 
HSFQVI BSLYQKLHEGHGK 


7075 


263 


1003 


VCPVLCi>-i'KUKPGH5SLVTYFGKPTIUZKEF£j:GHC^AAGK?^ 

\^LETNYAELVlJ>VGRVTIX3ENSRKKMKI>CKr;RKKQNERVSRA^^ 

CALLNSGGqyiKAEIENBDYSYTKDGIGLDLENSFSNILXiFVPE 

YliDFMQrTGNYPLrrVKSWSLNTSGLRITTLSSNLYICRDITSAKV 

MmTAALEFLKDMIOCTRGRLYLRPELIAKRPRVDIQEEtmMKAL 

AGVFFDRTELiDRKEKLrFTESTHVEI 


7076 


598 


1005 


NYINt i. t KKh ^ PPHV;3KVEINPVRLSRLQGVEklMKKTEESESQ " 
VEPEIKRKVQQKRHCSTYOPTPPLSPAfilfK'rr tkt fnr^rjM/>nX 
^^^^S^^^^L^RTSIHQNSGGQKSQm'GLTTKKFYGmiVEKVP 


7077 


279 


1049 


LQSESSKAAEGNEQRHi;!DEQRSKRGGWSKGRKT?yK-p"T pncawa Pb- 
SPLTGYVRF^3NERREQLRAKRPEVPFPE1TRMI.GNEWSKLPPEE 
KQRYLDEADRDKERYMKELEQYQKTEAYKVFSRKTQDRQKGKSH 
KyuAARQATHDHEKETEVKERS VPDIPI FTEEFLNHSKAREAEIt 
ROLRKSWMEFEERWAALQKHVESMRTAVEKLEVDVIQERSRWTV 
LQQHLETLRQVLTSSPASMPLPEXGETPTVDTIDSYM 


7078 


3 


X119 

: 


bSMGSNSEINGlALRKTDKYGFLGGSQYSGSLkssiPVDVARQR 
ELKWLDMPSNWDKWLSRRFQKVKLRCRKGIPSSLRAKAWQYLSN 
SKELtEQNPRKFEELERAPGDPKWLDVIEKDLHRQFPFHEMPAA 
RGGHGCX>DLYRILKAYTIYRPDEGYCQAQAPVAAVI,LMHMPAEQ 
\PHCLVQICDKYLPGYYSAGLEA1QLDGEIFFALLHRASPLAHR 
HLRRQRIDPVLYMTBMFMCIFARTLPWASVLRVWDMFFCEGVKI 
rFRVALVLLRHTLGSVEKLRSCQGMYETMEQLRNLPQQCMOBDF 
tiVHEVTNLPVTEALIERENAAQLKKWRETRGELQYRPSRRLHGS 
yVXHEERRRQQPPLGPSSS 


7079 


483 


767 ] 
y 

c 


^qgqrmageqkpssnlleqfillakgtsgsaltalisqvleapg 

m^GELLErANVQELAEGAKAAYLQIiriNLPAYGTYPDYIANKE 
SLPELY 




2 


376 < 

^ 


^wefkrpkepsgsdgesdgpidvgqegqlsqmarplstpsssq ■ 

lOARKKRRGIlEKRRRDRINSSLSELRRLVPTAFEKQGSSKLEK 





7080 
7081 


200 


595 


AEVLQMTV-DHLKMLHATGGTGTHALLFQASFIQQI F ~- 

VQLPLEAP^L.5L,LSCRDH6GUMKULSRRHRDCRVYGSPQDGIPY- 

SGDLLFLSGCGEFPRKREELGEEGETEVRAArVPWRAT.IfP • 


7082 


213 




AVTKEEMlU^-;i^LcjYHNKLIIAPWVRVGTLPMRL 


7083 


3 

115 


1137 


APS J^NTMUviA WCRGP VLLCLRQGLGTNS FLHGLGQKPFEGAR^I,-- 
CCRSSPRDLRDGEREHEAAQRKAPGAESCPSLPLSISDIGTGCL 
SSLENLRLPTLREESSPRELEDSSGDQGRCGPTHQGSEDPSMX,S 
QAQSATEVEERHVSPSCSTSRERPPQAGELILAETGEGETKPKK 

aledywi^krgtaitfpkdinmii^mmdinpgdt^ 

veewpdnvdfihkdisgatediksltfdavaldmlnphvtlpvf 
yphlkhggvcpvywnitqvielld 


7084 




541 


spkaprarpcrvstadrsvrkgimaysledlllkvrdtlmladk 


7085 


3 


522 


viotkapaevqitaeqllreakkkelellppppqqkitdeeelnd 
yklrkrktfednirknrtvisnwikyaqweeslkeiqrarsiye 

RALDVDYRNlTLWLKYAEMEMKNROVNHAimiwnRaT'TTT. 


7086


243 


1499 


yavgnhdpieaykoqtvivqsflrafqahkeenwalpvmyaval' 

DLRVFANNADQQbVKKGKSKVGDMLEKAAELUISCFRVCASDTR 
AGlEDSKKWGMLFLVNQLFKIYPKlNKLHLCKPblRAIDSSNLK 

NLI»LLHEALAKHEAFFIRCGIFLILEKLKIITYRNI,PKKVYI,Dr, 
o^!!Si;^^^^^^^^^^^°^^»SVQCIIAWI,IYMGHVKGYI 
SHQHQKLWS KQWPFPPLSTGC 


7087 


256 


525 


lIAAR^iL,K*JNfaJU.KPKVMQDLl^SU^l>t^^^i'KHEXOEWYKGFXlRDCP~ 

SGHLSMEEFKKIYGNFFPYGDASKFAEHVFRTFDANGDGTIDFR 
EF 


7088 


166 
104 


723 


VTERIIAVSPPSTANEENFRSNLREVAQMLKSKHGGWYLLFNLS 

ERRPDrTKLHAICVI*EFGWPDLHTPALEKICSICKAMDTWI,NAHP 
3RCRVLHNKG 


7089 




759 

) 


i^iS^^^^^^^^^^^^'^^'^^^^^SYVGLVYMFNJLIVGTGALT 
^f^^3^"^^^^^^^^^^^'^'^^IEAMAAANAQt,HW 
KRMENLKEEEDDDSSTASDSDVLIRDNYERAEKRPILSVQRRGS 
PNPFEITDRVEMGQMASMFFNKVGVNLPYFCIIVYLYGDIAXYA 
\AVPFSLMQVTCSATGNDSCGVEADTKYNDTDRCtfGPT,Pt7Tm 




33 


1775 : 

C 

K 
N 


wuwxuUKrLKARNEESPLSRAPSRGGVNFU^ARTYlPNTKVEC" 

lYTLPPG'rMPSASDWIGIFKVEAACVRDYHTFVWSSVPESTTDG 

;pSnrrvi^^'''^'''^''^^^^^^^^<^^«5QSPPFQ^ 

r^^fi: ^^^^^^^^^^VVPKATVLQNQIJ>ESQQERNDLM 

J:^SJ;^^^'^^^^^^^^^^E'f^^TARQEHTELMEQYKQISRS 

GEITEERDILSRQQGDHVARILELEDDIQTISEKVLTKBVELD 

iRDTVKALTREQEKLLGQLKEVQADKEQSEAELQVAQQENHHXi 

I^LKEAKSWQEEQSAQAQRLKDKVAQMKDTLGQAQQRVAELEP 











LKEQbRGAQELJ^SQQKATLIiGEEtASAAAARDRTlAEbHRSR 

LKVAEVNGKIiAELGLHLKEEKCQWSKERAGLLQSVEAEKDKlLK 

liSAEILRLEKAVQEERTQNQVFKTEUAREKDSSLVQLSESKREfi 

TELRSALRVLQKEKEQLQEEKQELLEYMRKLEARLEKVADEKWN 

EDATTEDEEAAVGLSCPAALTDSEDESPEDMRLHPMAPVSVETO 
ASLLIiGLE 


7090 


33 


1775 


SVCWEORYLKARMEESPLSRAPSRGGVNFLm^UilYlPWXKV^^ 

HYTLPPGTMPSA^DWIGIFKVEAACVRDYHTFVWSSVPESTTDG 

SPIHTSVQFQASYLPKPGAQLYQFRYVNRQGQVCGQSPPFQFRE 

PRPMDELVTLEEADGGSDILLWPKATVLQNQLDESQQERNDLM 

QLKLQLEGQVTELRSRVQELERAIATARQEHTELMEQYKGISRS 

HGEITEERDILSRQQGDHVARILELEDDIQTlSEKVIiTKEVELD 

RLRDTVKALTREQEKLLGQLKEVQADKEQSEAELQVAQQENHHL 

NLDLKEAKSWQEEQSAQAQRLKDKVAQMKDTLGQAQQRVAEIiEP 

LKBQLRGAQELAASSQOKATLtGEBLASAAAARDRTIAEIiHRSR 

LEVAEVNGKIiAEIiGLHLKEEKCQWSKERAGLLQSVEAEKDKILK 

LSAEILRLEKAVQEERTQNQVFKTELAREKDSSLVQI*SESiCREL 

TELRSALRVLQKSKEQLQEEKQELLEYMRKLEARLEKVADEKWN 

EDATTEOEBAAVGLSCPAALTDSEDESPEDMRIiHPMAFVSVETQ 

ASLLLGLE 


7091 


186 


1076 


EGMLTREHRCGRSEEQEl*EPWPSPKKARSGRWIiRNGFKRKMEEP 
EEPADSGQSLVPVYIYSPEYVSMCDSr^KIPKRASMVHSIjIEAy 
ALHKQMRIVKPKVASMEEMATFHTDAYIiQHLQKVSQEGDDDHPD 
SIEYGLGYDCPATEGIFDYAAAIGGATITAAQCLIDGMCKVAXN 
WSGGWHHAKKDEASGFCYLNDAVLGrLRLRRKFBRIIiYVDLDLH 
HGDGVEDAFSPTSKVMTVSLHKFSPGPPPGTGDVSDVGLGKGRy 
YSVNVPIQDGIODEKYYQICERYEPPAPNPGL 


7092 


522 


609 


kqginedqeesqkprlgegcepiskrqmkklikqkqweeqreijR 

KQKRKBKRKKiKKIiERQCQMEPNSDGHDRKRVRRDWHSTLRLri 
DCSFDXIiM 


7093 


454 


655 


NFGVSGVELAQQASMVRMS FVIAACQLVLGLLMTSLTE'SS IQNS 
ECPQLCVCE I RPWFTPQST YREA 


7094 


2 


508 


fvrsmhwgvgfassrpcvvdlswnqsispfgwwag'seepfsf^g' 

DlIAPPLQDYGGIMAGLGSDPWWKKl^LYLTGGALIAAAAyiiLHE 
EJbVIRKQQE IDS KDAI I LHQPARPNNGVPS LS PFCLKMETYIjRM 

aolfyqkyfggklsaqgkmpwieynhekvsgtefii 


7095 


1 


411 


SLECVSHEVDSHYCPSCLEl^PSAEAKLKKNIlCANCPDCPGC^4H 
TLSTRATSISTQLPDDPAKTTMKKAYYIACGFCRWTSRDVGMAD 
KSVGE 


7096 


224 


2067 


ETRSLAVQEKPSQAGRRRSSRtSFAGALFIiTRFUKJELLLNNFC 
SAMSPAPDAAPAPASISLFDliSADAPVFQGIiSLVSHAPGEAIiAR 
APRTSCSGSGERESPERKLLQGPMDlSEIOiFCSTCDQTFQNHQE 
QREHYKLDWHRFNIiKQRLKDKPLLSALDFEKOSSTGDLSSISGS 
EDSDSASEBDI.QTIJ>RERATF'EKLSRPPGFYPKRVIiFQNAQGQF 
LyAyRC\^GPHQDPPEEAELLLQNLQSKGPia5CVVLMA7VAGHFA 
GAIFQGREWTHKTFHRYTVRAKRGTAQGr^RDARGGPSHSAGAN 
LRRYNEATLYKDVRDLLAGPSwAKALEEAGTIIiLRAPRSGRStiF 
FGGKGAPLQRGDPRIiWDIPLATRRPTFQEriQRVLHiCLTTLHVYE 
EDPREAVRLHSPQTHWKTVREERKKPTEEEIRKICRDEKEALGQ 
NEESPKQGSGSEGEDGFQVELELVELTVGTLDLCESEVLPKRRR 
RKRNKKEKSRDQEAGAHRTH^QQTQEEEPSTQSSOAVAAPLGPL 
LDEAKAPGQPELWNALLAACRAGDVGVLKLQLAPSPADPRVLSL 
LSAPLGSGGFTLLHAAT^AAGRGSWRIiLLEAGADPrVQCQDH 


7097 


25S 


1228 


IRTKSAATWEAWPQCGREGSRIITEPCEANAGSRQELQTERISS 
FLAAQGDQAFHSGIiETNNSNSELPLRVGLKVAQGSPLMGGQVSA 



T 











SNS FSRLHCRNANEDWMSALCPniilTOVPIiHHbSI PGSHDXMTYC 
LNKKS P ISHEES BXjhQhhNKPdjVCX TRPWLKWSVTQALDVTEQ 
IjDAGVRyriDLRIAHMLEGSEKNLHFVHMVyTTAI.VEDTLTElSB 
WLERHPREWI LACRWFEGLSEDLHEYLVACI KNI FGDMIiCPRG 
EVPTLRQI.VISRGQQVIVSyEDESSLRRHHELWPCVPYWWGiaRVK 
TEALIRYLETMICSCGR 


7098 


82 


956 


SSPliKRCRKVUSCWGXPSEQSXjFSTIiEEPRDKEIDNyCVMRLQT 
EARSGFWAPNRFPVNICRMTAVDGDRGGSSRETCRCHFHPSLEA 
LVIiIiLQDMQPGGVGICTSFt/GISWALLDYHRALRTCLPSKPLLG 
LGSS VI y FJ:,WNLIjLLWPRVLAVALFSAIjFPSYVAUIPI>GLWI,VL 
LLWVWLOGTDFMPDPSSEWI>YRVTVATILyFSWFNVAEGRTRGR 
AIIHFAFLLSDSILLVATWVTHSSWIiPSCIPLQLWLPVGCGCPF 
IXSIALRLVYYHWLKPSCCWKPDPDQVD 


7099 


992 


210 


LFRLAPGFLRS IiARQGYHQ I WAFPFLPSGATATWPAASRSRSLA 
ARSIiPRSPARPGPNDALLGEHPFRGQGVRAORFRFSBBPGPGAD 
GAVLE VHVPQIGAGVSLPG I LAAKCGAEVIIiSDSSELPHCliEVC 
RQSCQMNNI.PHU3WG1.TWGHI SWDL.IALPPODI lUASDVPFBP 
EDPED I LAT lY FLMHKNPKVQI/WSTYQVRS ADWSLEALLYKWOM 
KCVHI PLES FD ADKEDl AESTLPGRHTVEMLVI SFAKDSI* 


7100 


20S 


671 


ANGG FWEAAPGSE VSIjPLWVPTASHSKTTALG IGSAPPPHLS VL 
FLFSFPPQtiGDPIiEAFPVFKKYDRNGLiNVSlECKRVSGIjEPATV 
DWAFDIiTKTKMQTMYEQSEWGWKDREKREEMTDPRAWYLIAWEN 
SS VP VAFSHFR FD VERGDE VLYW 


7101 


2 


503


WRGGPRRAKRUVSGAVGWVLLVRGVHSVRAGGGRPPRAADMiCKD 
VRILLVGEPRVGKTSLIMSLVSEEFPEEVPPRAEBITXPADVTP 
ERVPTHIVDYSEAEQSDEQLHQEISQANVlCIVYAVNNKHSXt^ ' 
VTSRWIPLINERTDKDSRLPLILGGNKSDLVEYSR ""^^ 


7102 


2 


503 


WRGGPRRAKRXAGGAVGWVLLVRGVHSVRAGGGRPPRAADMKKD 
VRIIiLVGEPRVGKTSLIMSIiVSEEFPEEVPPRAEEITIPADVTP 
ER VPTHI VDYS EAEQSDEQLHQE I SQANVICI VYAVKNKHS IDK 
VTSRW I PLINERTDKDSRLPLILGGNKSDLVEYSR 


7103 


119 


438 


GSQSSVAVNIRSGTDEESMDLMNGQASSVNIAATASEKSSSSES 
LSDKGS ELKKS PDAWFDVLKVTPEE YAGQITLMDVPVFKAIQP 
DELSSCGWNKKEKYSSAP 


7104 


1670 


795 


RLWEHRS VSAGASGWGLSS PGCIiLIiHPSLPEEERVD ILINWAGV 
MRCPHWTTEDGFEMQFGVWHLGEAWAGAAPWVQAIIiPRRPPKVL 
GP*V* VKSDLFI I I*NPGHFl.LTWl4LI*DKLKASAPSRI INLSSLA 

GSGVTWALHPGVARTELGRHTGIHGSTPLQHHN\WAHLLAAWS 
KSPRSWPAPAQHNTIAVAEELA\VISGKYFDGLKQKAPAPEAED 
EEVARRIiWAE5ARIfVGLEAPSVREQPliPR 


7105 


765 


143 


GQMCRRPS PKSTSCJUSMTCDLP/ RGIiQDPQCIiALFRVAVDKHQA 
LLKAAMSGQGVDRHLFALYIVSRFLHLQSPFLTOVHSEQWQLST 
SQ IPVQQMHLFDVHNYPDYVSSGGGFGPADDHGYGVS YI FMGDG 
MrTFHlSSKKSSTKTDSHRLGQHIEPALLDVASLFQAGQHFKRR 
FRGSGKENSRHRCGFLSRQTGASKASMTSTDP 


7106 


14 


1054 


GLQAGHPHPRSASRIPEADTHNYSKIiQRAFDSIVKKDHKRMFGT 
YFRVGFFGSKFGDLDEQEFVYKEPAJTKLPEISHRIiEAFYGQCF 
GAEFVEVIKDSTPVDKTKLDPNKAYIQITFVEPYFDEYEMKDRV 
TYFEKNFIOjRRFMYTTPFTLEGRPRGELHEQYRRNTVIiTTMHAF 
PVr KTR J S VIQKEEF VLTPIEVAI EDMKKKTLQLAVAXNQEPPD 
AKMLQMVLQGSVGATVNQGPLEVAQVPLAEIPADPKLYRHHNICL 

rlcpkefi mrcgeyiveknkriiltadqre yqqelkkwynklkeni* 
rpmierkipelykpifrvesqkrdsfhrssprkcetqlsqgs 


7107 


1145 


591 


*I*WU}fGlCKK 



4 



7108


1 


942 


VKVALLbTNI.EQPRTESEWENSFTLKMFLFQFVNIiNSSTFYXAP 
FLGRFTGHPGAYLRLlNRWRI^ECHPSGCIjlDLCMQMGIlMVIiK 
QTWWNFMELGyPLIQNWWrRRKVRQEHGPERKISFPQWEKDYNIi 
QPMNAYGLFDEYLEMILQFGFTTIFVAAPPIAPLLALLNWIIEt 
RLDAYKFVTQWRRPLASRAKDIGXWYGXIiEGXGILSVITKAFVI 
AITSDFIPRLVYAYKYGPCAGQGEAGQKCMVGYVNASLSVFRIS 
DFENRSEPESPGSEFSGTPLKYCRYRDYRDPPHSIiVPYOYTriQF 
WHVIAW 


7109 


964 


102 


WDQRKRNSLVPGPAHGPAQEEPWEKKESLQAAQEAbSIQLQPKE 
1'QPFPKSEQVYLHFLSWTEDGPEPKDKGSLPQPPITEVESQVF 
SEKIiATDTSTFEATSEGTLELQQRHPKAERLRWSPAQEESFROM 
WIHKEIPTGKKDHECSECGKTFIYNSHLWHQRVHSGEKPYKC 
SDCGKTFKQSSNLGQHQRIHTGEKPFECNECX3KAFRWGAHLVQH 
QRIHSGEKPYECNECGKAFSQSSYLSQHRRIHSGEKPPICKECG 
KAYQHCSELIRHRRVHARKEPSH 


7110 


96 


697 


RLDHFSGFLVEVTKEERHIVKPLYDRYRLVKQMLTRASITPVIiG 
SPSTKRRGQMLQP X I EGETAHFFEE I KEEEEDGVNLSSELGDML 
KTAVQVQSSLKNSESDVEENQEKLALDLRtiSSSRAASMPELI*EQ 
LWKAIIAEKKKLRKTIjREFEEAFYQQNGRNAQKEDRVPVIjEEVRE 
YKKlKAKLRLLEVLISKQDSSKSt 


7111 


2 


414 


GSGLYRGPTPGGQCIWKPNSMPPDHBRNFGFTQFAXjEIjNELTAE 
LKRSI.PSXr)TRIiRPDQRYLEEGNlQ7VAEAQKRR lEQLQRDRRKV 
MEElNn^IVHQARFFRRQTDSSGKEWVrVTNNTYWRLRAEPGYGNMD 
GAVLtf 


7112 


103 


495


PRCFPVADRGRL I GGLPDWTIMEGKTLNLTCTVFGNPDPEVI W 
FKNDQDIQLSEHFSVKVEQAKYVSMTIKGVTSEDSGKYS INIKt^^ 
KYGGEKXDVTVSVYKHGEKIPDMAPPQQAKPKLI pasasaagq ^ ■ 


7113 


1 


824 


KCLR0AWHE2>PSS LiAFTRKCSREERAEGGGNI>£iRS XTRDPKPPG 
mPSQRPMDbkKKKRSFKPCLAQPAQAPGXLiRRVPVPTSHSGSL. 
ALGLPHLPSPKQRAKFKRVGKEKCRPVLAGGGSGSAGTPLQHSP 
LTEVTDVYEMEGGLLNt>LNDFHSGRLQAFGKECSFEQLEHVREM 
OEKIARIiHFSLDVaSEEEDDEEBElXSVTEGLPEEQKKTMADRNL 
DQLLSNLGSCLGALVPGGMRGGEGTYSQSHSWALGEKVGVHGSK 
SSGPI.NLPRR 


7114 


3 


1492


VWEVDEQIDHYKESQDKFLWQAAFlGKETtiKDESGQECKICRKI 

lYIiNTPFVSVKQRLPKYYSWERCSKHHLNFLGQNRSYVRKKDrKS 

CKAYWKVCtHYtOjHKAQPT^RFFDPNQRGKALHQKQALRKSQRS 

QTGEKLYKCTECGKVFIQKANLWHQRTHTGEKPYECCECAKAF 

S QKSTL I AHQRTHTGEKP YE C^ECX3KTFIQKSTLI 

KPFVCDKCPKAPKSSYHLIRHEKTHIRQAFYKGIKCTTSSLIYQ 

RIHTSEKPQCSEHGKAS DEKPSPTKHWRTHTKENIYECSKCGKS 

FRGKSHLSVHQRIHTGEKPYECSICGKTFSGKSHLSVHHRTHTG 

EKPYBCRRCGKAFGEKSTX.rVHQRMHTGEKPYKCNECGKAFSEK 

SPLlKHQRIttTGERPYECTDCICKArSRKSTIiIKHQRIHTGEKPY 

KCSECGKAFSVKSTLIVHHRTHTGEKPYECRDCGKAFSGKSTIil 

KHQRSHTGDKNI, 


7115


1 


947 


NAAHGYNWGLWCM YI I PPQDWLDRGDESAPIRTPAM IGCS FVVD 
REYFGDIGLLDPGMEVYGGENVKLGMRVWQCGGSMEVXiPCSRVA 
HI ERTRKPYNND XDYYAKRNALRAAEVWMDDFKSHVYMAWNI PM 
SNPGVDFGDVSERI*AIiRQRIiKCRSFKWYLENVYPEWRVYNNTLT 
YGEVRMSKASAYCI.DQGAEDGDRAIIiYPCHGMSSQL.VRYSAI>GIi 
LQLGPLGSTAFLPDSKCJ^VBDGTGRMPTLKKCEDVARPTQRI/WD 
FTQSGPIVSRATGRCLEVEMSKDANFGLRLWQRCSGQKWMIRN 
WIKKARH 


7116 


856


95 


RVRMRRNAEVIEEKLSMKSWAKFRPGEPWKGYPNIDPETDPYVT. 
PGSVINWLS IKTVREVDHLRDRNSGSSSSIiMTTLPSTSAWSS IR 








ASNVNVPLSSTAQSTSARNSDSKLrWSPGSVTMTSLAHEIiWKVP 
LPPKNIT7VPS RPP PGLTGQKPPLSTWDNS tl^RIGGGWGKSDARY 
TPGSSWGESSSGRITNWr^VLKNLTPQIDGSTIiRTLCMQHGPIiIT 
FHLNLPHGNALVRYSSKEEWKAQKSLHISDLPLtTL 


7117 


695 


1261 


LLI STPGGCH PP PSS lEFTVTGAWGKALPTiPHMPCAPGAIiPQGA 
F VS QAARAI P LLiQPSQAAQAEGLSQPARAOGALCSLPM PLRNWG 
SPI LRLPGGI^RTPTNDRKTRlllSAMACWARAQWDTLGPLKbSHR 
GKVCLRHPRPTGVRGGPGAAGRQGGMGTRRRGTFTSGARDPGGI* 
RVKHRCQPTGHLP 


7118 


49


1863 


PHCEPNPGAGAMVLLHVLFEHAVGYALLALKEVEEISLLQPQVE " ' 

ESVIJJLGKFHSIVRLVAFCPFASSQVALENANAVSEGWKEDLR 

LLIJETHIiPSKKiaCVLLGVGDPKIGAAIQEBLGyNCQTGGVIAEI 

tiRGVRLHFHNLVKGLTDLSACKAQLGLGHSYSRAKVKFNVNRVD 

NMI IQS ISI>LDQLDKDINTFSMRVREWYGYHFPELVKIINDNAT 

YCRlAQFIGNRREIiNEDKLEKLEEliTMDGAKAKAILDASRSSMG 

MDI SAIDLINI ES FSSRWSLSEYRQSt/HTYLRSKMSQVAPSIiS 

AlilGEAVGARX*! AHAGSLTNLAKYPASTVQl LGAEKALFRALKT 

RGNTPKYGLIFHSTFIGRAAAKNKGRISRyLANKCSIASRIIXIF 

SEVPTSVFGEKLREQVEERLSFYETGEIPRKNLDVMKKAMVQAE 

EMSEKPKKKKKQKPQEVPQENGMEDPSISFSKPKKKKSFSKEEri 
MSSDLEETAQSTS IPKRKKSTPKEETVNBPEEAGHRSGS KKKRK 
FSKEEPV3SGPEEAAGKSSSKKKKKFKKASQED 


7119


49 


1863 


PHCEPNPGAGAMVliLHVLFBHAVGYAX*LALKEVEEISLLQPQVE 

ESVLNIiGKPHS I VRLVAFCPFASSgVALBNANAVSEGWHEDt^R 

LLliETHLPSKKKKVLLGVGDPKIGAAIQEELGYWCQTGGVIAE^ ' 

I^GVKLHFHNl.VKGLTDLSACKAQIX3LGHSYSRAKVKFNVHRVD^-v 

NMI IQS ISLLPQLDKDINTFSMRVREWYGVHPPELVKI INDNAT 

YCRLAQFIGNRRELNEDKLEKLEELTMDGAKAKAIIJDASRSSiyKS 

MDISAIDLINIESPSSRWSLSEYRQSIJITYLRSKMSQVAPSLS 

Ar,IGEAVGAEI.IAHAGSLTNIiRKYPASTVQlLGAEKRIjFRAL.KT 

RGNTPKYGLI FHSTFIGRAAAKNKGRISRVIiANKCSI ASRI DCP 

SEVPTSVFGEKLREQVEERI^FYETGEIPRKHLDVMKEAMVQAE 

EAAAEITRKLEKQEKXRLKKEKKRIAALALASSEJfSSSTPEECB 

EMSEKPKKKKKQKPQEVPQENGMEDPSISFSKPKKKKSFSKEEt* 

MSSDLEETAGSTS I PKRKKSTPKEBTVNDPEEAGHRSGS KKKRIC 

FSKBEPVSSGPEEAA6KSSSKKKKKFHKASQED 


7120 


1991 


64 


QUSTRRCLRGDKVTNAMQDFLVTNLEPRFIEPQTAKLSWFKDS 
NSTTPLrFVLSPGTDPAADIiYKFAEEMKFSKKLSArSUSQGQGP 
RAEAMMRSS IERGKWVFFQNCHLAPSWMPAI*ERLI EHINPDKVH 
RDFRr.Wr.TSLPSWKFPVSILQNGSKMriEPPRGVEANIXKSYSS 
LGEDFLNSCKICVMEFKSLI.LSLCLFHGNALERRKFGPLGFNI PY 
EFTDGDLRICISQIiKMFLDEYDDlPYKVLKYTAGEINYGGRVTD 
DWDRRCrMNILBDFYNPDVLSPEHSYSASGIYHQIPPTYDI.HGY 
LSYIKSLPLNDMPEI PGLHDNANITFAQNETFALLGTI IQLQPK 
SSSAGSQGREEIVEDVTQNILLKVPEPINLOWVMAKYPVLYEES 
MNTVLVQBVIRYKRLLQVITQTLQDLLKALKGLWMSSQLELMA 
ASLYNWTVPELWSAKAYPSLKPt^SWVMDLLQRLDFLQAWIQDG 
r PAVFWISGFFFPQAPLTGTLKJNFARKFVIS IDTISFPFKVMFE 
APSELTQRPQVGCYIHGLFLEGARWDPEAFQLAESQPKELYTEM 
AVIWLLPTPNRKAQDQDFYLCPIYKTLTRAGTLSTTGHSTNYVT 
AVE I PTHQPQRHWI KRGVAIil CALDY 


7121 


2 


546 


RPLRPWVDSIiGSMVGLMTYGRRQFQSLDTTMRRLIPPFREASAK 
LTTLVDADAEAFtAYLEAMRLPKNTPEE KDRRTAALQEGLRRAV 
S VPLTIiAETVAS P ALQELAROGNLACRSDLQVAAKAIiEMGVF 
GAYEma.INLRDIT0EAFKDQIHHRVSSLU3EAKTQ7U\LVI*DCL 








ETRQE 


7122 


2 


546 


R PLRPWVLS LG SMVGLMT YGRSQFQSLDTTMRRLI PPFREASAK 
LTTLVDRDAEAFTAYLEAMRLPKNTPEEKDRRTAALQEGL^iRAV 
SVPLTIAETVASLWPALQEIARCGNIACIISBLQVAAKALEMGVF 
GAYFNVXjINLRDITDEAFKPQIHHRVSSIMEAKTQAAIiVLOCL 
ETRQE 


7123


1 


1092 


KPAVPEARSAGTSEAGRSGAEEVSCGSVSGDGAAMRLTPRALCS 
AAQAAWRENFP I*CX3RDVARMFPGHMAKGLK5UflOSSljKliVDCI IE 
VHDAR I PLSGRMPLFQETLGLKPHLLVLNKMDIADLTBQQKIMQ 

HLEGEGLKNVI ftncvkdenvkqi I pmvteligrshryhrkenl 

EyCIMVIGVPKVGKSSIiIKSLERQHIiRKGKATRVGGEK3ITRAV 
MS KIQVSERPLMFLIiDTPG VIAPRI ESVETGLKLALCGTVI^HL 
VGBETMADYLLyTLNKHQRFGYVQHYGLGSACDNVERVliKSVAV 
KIjGKTQKVKVLTGTGNVKVIQPNYPAAAROFLQTFRRGIjI/GSVM 
LDLDVIiRGHPRV 


7124 


2 


382 


LPLTLLLAAP FAHLLLPPGHDQSPCWHPGPALS PGTliGPLSWAM 
ANSGLQLLGVFL7VLGGWVG 1 1 ASTALPQWKQSSYAGDASl QLRS 
KVFVLESEWGGDSLGLPRDCGWSCLLHSAVRSEKGFWS 


7125


166 


1127 


NCIS EKRNYS F SMQKGKGRTSRIR3flRKLCGSSESRGVWESHKSE 
FIELRKWLKARKFQDSNLAPACFPGTGRGIiMSQTSliQEGaMriS 
LPESCLliTXRDTVIRSYIiGAYITKWKPPPSPLIiAUrrFLVSEKH 
AGHRS LibE A\ Y liE 1 LPKAY TCPVCLEPE WNLLP KSLKAKAEEQ 
RAHVQEFP7^SRDFFSSLQPLFAEAVDSlFSYSAI*LM7yMCTVNT 
RAVYL \ S PGSGNAFLQSRTPVQLAP YLDLTJ^HS PHVQVKAAFNE 
ETHSYEIRTTSRWRKHEEVFXCYGPHDNQRLPLEYGFVSVHNPH 
ACVYVSRGWNQIiCS ^ 


7126 


1 


733 


CRDMAAFI VPS PARRCSQKGSLGHLPTQPWLW AAMS PRGQERGT " 
SHSQAREPQRPGRWI*U3SL0SSPGTLGQAGTASRRRGCMVQRWV 
QVATGRRAVQV P KGALGIAIiGETS PGASRGMSGGAGGCWALGWA 
PSPVLPSWLLEGPPPWLS I ISDSGTQRPSPRRCPARPSPWGPQC 
WRGGRIASAEASST*TPGSGSRARSGRRSPGSRRRSASAPSPTP 
PTDACA* SCVARPAGSRSSRPAAA 


7127 


1311 


277 


GL»P;^CST* KAGYYEETEGPClPKXtR* lEKRPFKEI *RRIPRI F 
AKQKQ I * S*NSQKIGASElDRGRKEADCSnAPAAARIGAVSVFR 
RSTQEARVSPRSNAKSANIiRAVRAD*WEHFVLIjFHTPEQFLAEC 
ICRST* * K* WHQLC* PLSSIi*TGI.KRKLLL*VLFRI *WLKDCDV 
* FGQKI FATNFC2JWQNLIQ* EE * KPVEYSVEN*HIMNLLI.PM*L 
CQSSLRDQTI VTWRM * RNY SMFRINMlSSIi* DGS IHI PLKLHFY 
PALIFTLTVPINSCCQRPIiPLFAHQSIKTIjASSGSPMLACLRFIi 
LVKKRAFIHTPRS PGCS V* CKHVLVKDWKNNCVGSEV 


7128 


2 


5228 


GRVDLWTIIiIjGRSALRELSQlEAElKKHMRRLLEGI>SYYKPPSP 
SSAEKVKANKDVAS PLKELGLRIS KFIiGLDEEQS VQLTUQC YLQE 
DYRGTRDSVICrVI>QDERQSQAIiILKIADYYYEERTCIt,RCVLHI* 
LTYFQDERHPYRVEYADCVDKLEKELVSKYRQQPEELYKTEAPT 
WETHGULiMTERQVSRWFVQCLREQSMLIiEIIFLYYAYPEMAPSD 
LliVtiTKMFKEQGFGSRQTNRHLVDETMDPFVDRIGYFSAIilLVE 
GMDIESlJIKCALDDRREl^QPAQDGrilCQDMDCLMIiTPGD I PHH 
APVt.LAWALljRHTLNPEETSSWRKIGGrAIQLNVFQYtiTRIjIiQ 
S IiASGGNDCTTSTACWCVYGH,SFVLTSI.EUiTLGNQQDI IDTA 
CE VLADPSLPELFWGTEPTSGLG I IIiDSVCGMFPHLLSPl.IiQLli 
RALVSGKSTAKKVYSFLDKMSFYMELYKHKPHDVTSHEDGTLWR 
RQTPKLLYPU3GQTKLRIPQGTVGQVMLDDRAYLVRWEYSySSW 
TLFTCE XEMLbHVySTADVXQKCQRVKP I IDLVHKVI STDLSI A 
UCtiLPITSRI YMLLQRLTTVIS PPVDVIA5CVNCr.TVX*AARNPA 
KVWTDLRHTGFL PFVAHPVSSLS QMIS AEGMNAGGYGNL W4NSE 
QPQGEYGVTIAFLRIilTTLVKGQLGSTQSQGLVPCVMFVIiKEMIi 








PSyHKWRYNSHGVREQIGCLILELlHAILUIiCHETDLHSSHTPS 
liQFIjCI CSLAYTEAGQTVINIMGI GVDTIDMVMAAQPHSDGAEG 
QGQGQLLIKTVKLAFSVTNNVIRIiKPPSNVVSPI.KQALSQHGAR 
GNNLIAVLAKYXyHKHPPALPHLAlQLLKRLATVAPMSVYACLG 
NDAAAIRDAFLTRLQSK\IE\DMRIK\VMXL\EFLTVA\VETQP 
GLIELFLNLEVKDG\SDGSKEFSLGMW\SCLHAV/VWEI*IDSQQ 
QDRYWCPPLLHRAAIAFLHALWQDRRDSAMLVLRTKPKFWEKIjT 
SPLFGTLSPPSETSEPSILETCALIMKIICLBIYYWKGSLDQP 
LKDTLKKFS I EKRFAYWSGYVKSLAVHVAETEGSSCTSIjLEyQM 
LVSAWRMLIj 1 1 ATTHAPIMHLTDS WRRQLFLDVLDGTKALLliV 
PASVNCLRLGSMKCTLLLILLRQWKREIiGSVDElLGPLTEILEG 
VLQADQQL^ffiKTKAKVFSAFXTVLQMKEMKVSDlPQYSQI^VLNV 
CETLQEEVIALFDQTRHSLALGSATEDKDSMETDDCSRSRHRDQ 
RDGVCVLGLHLAKELCEVDEDGDS WIjQVTRRLP I LPTLLTTLE V 
SLRMKQNLHFTEATLHI^LLTLARTOQGATAVAGAGITQSICLPL 
IiSVyQLSTNGTAQTPSASRKSLDAPSWPGVYRt.SMSI*MEQLL!er 
liRYNFLPEALDPVGVHQERTLQCIiNAVRTVQSIiACIiEEADHTVG 

filqlsnfmkewhfhlpqlmrdiqvni.gylcqacrsfiihsrkmr. 
qhylonkngdglpsavVaqrvXqrppsaasaapssskqpaadte 

A5EQQALHTVQyGr,riKILSKTJuAALRHFTPDVCQXi:.LDQSLDIA 
EYKFI.FALSFTTPTFDSEVAPSFGTLLATVNVAliNMLGELDKKK 
EPLTQAVGLSTQAEGTRTLKSLLMFTMENCFYIiIilSQAMRYLRD 
PAVHPRDKQRMKQELSSELSTIiIiSSIiSRYFRRGAPSSPATGVBP 
S PQGKSTSLS KAS PESQEPLIQLVQAFVRHMQR 




1 


1054 


FRRFRWRRRLH*AGPASSAGGSPGEASGTMSGELPPNINIKEPR 
WDQSTFIGRANHFFTVTDPRNII1LTNEQLESARKIVHDYRQ9IV 
PPGLTENEliWRAKYI YDSAPHPDTGEKMI LIGRMSAQVPMN^?tI 
TGCMMTFYRTTPAVIiFWQWINQSFlJAVVNYTNRSGDAPLTVNEIi 
GTAYVSATTGAVATALGLNALTKHVSPIilGRFVPPAAVAAANCI 
NX PLMRQRELKVG I PVTDENGKRLGESANAAKQAITQWVSRIL 
MAAPGMAIPPFIMNTLEKKAFLKRFPWMSAPlQVGLVGFCIiVFA 
TPLCCALFPQKSSMSVTSLBAELQAKIQESHPELRRVYFNKGIt 


7130 


2 


780 


HEVPSLQTSDPLPGSVQRCSVWSQPNKENWCQDHLYNSLGRKG 
ISAKSQPYHRSQSSSSVLINKSMDSIWYPSDVGKQQLLSLHRSS 
RCESHQDLLPDIADSHQQGTEKI*SDliTI*QDSQKVVVVKRNLPIJtI 
AQIATQNYFSNFKETDGDEDDYVEIKSEEDESELEIiSHNRRRKS 
PS KFVDADFSDIJVCSGNTLHS IiNSPRTPKKPVNSKLGLS P YLTP 
YNDSDKLNDYIiWRGPSPNQQNIVQSLREKFQCLSSSSFA 


7131 


805 


573 


AAAEGHIE WKFLI EACKVWPFAKDRWGNI PLDDAVQFNHLEW 
iCLLQDYQDS YTLS ETQAEAAAEALS KENLESMV 


7132 


1420 


1087 


I DMt.l,LSGAIjVSGP YTL ITTAVSADLGTHKSLKGNAHALSTVTA 
RL IHKELSCPGSATGDQVPFKEQ 


7133 


2 


3648 


OQIPGLLPAHGESGDAIiRKPRLQKPITGHLDDIiFFrLYPSZiEKF 
EEELLELHVQDHFQEGCGPi:iDGGALEII,ERRI.RVGVHNGLGFVQ 
RPQVWLVPEMDVAIiTRSASPSRKWSSSKTSSGSQAIiVIiRSRl. 
RLPEMVGHPAFAVI FQLEYVFSS PAGVDGMAAS VTSLSNIACMH 
M7RWAWrNPLLEAr)SGRVTt.PLQGGlQPNPSHCLVYJCVPSASMS 
SEBVKQVESGTIiRFQFSLGSEBHLDAPTEPVSGPKVERRPSRKP 
PTS PSS PPAPVPRVIAAPQWS PVGPGLS ISQIAAS PRSPTQHCL 
ARPTSOLPHGSQASPAQAOEFPLEAGISHLEADLSQTSLVLETS 
lAEQLQELPFTPLHAPIWGTQTRSSAGQPSRASMVLLQSSGFP 
EIIiDANKQPAEAVSATEPVTFNPQKEESDCLQSNEMVLQFliAFS 
RVAQDCRGTSWPKTVYFTFQFYRFPPATTPRLQLVQLDEAGQPS 
SGALTHILVPVSRDGTFnAGSPGFQXiRYMVGPGPIiKPGERRCFA 
RYIAVQTLQIDVWDGDSLLLlGSAAVQMKHLLRQGRPAViSASHE 








LEWATE^EQDNMWSGDMI-GFGRVKPIGVHSWKGRLHliTLAN 
VGHPCEQKVKGCSTIjPPS RSRVISNDGASR FSGGSI.LTTGSSRR 
KHVVQAQKLADVDSELAAMLLTHARQGKGPODVSRESDATRRRK 
LERMRSVRLQEAGGDLGRRGTSVLAQQSVRTQHLRDLQVIAAYR 
ERTKAESI ASLLSLAI TT£HTLHATI*GVAEFFEFVLKNPHNTQH 
TVTVEXDNPELSVIVDSQEWRDFKGAAGLHTPVEEDMFHIjRGSL 
APQIiYL.RPHETAHVPPKPQSFSAGQIiAMVQASPGLSN£KGMnAV 
S PWKS S AVPTKHAKVI*FRASGGKP I AVIiCLTVELQPHWDQVFR 
FYHPELS FLKKAIRLP P WHT FPGAPVGMLGEDPP VHVRCSDPNV 
I CETQNVGPGEP RDI FLKVASGPSPEI KDFFVI^ YSDRWIATPT 
QTWQ VYLHSIiQRVDVS CVAGQLTRLSLVLRGTQTVRKVRAFTSH 
PQELKTDPKGVPVLPPRGVQDLHVGVRPLRTVGSRFVHLNIiVDVD 
CHQLVAS WLVCLCCRQPL ISKAFE IMLAAGEGKGVNKRITYTNP 
ypSRRTFHLHSDHPEi:iI.RFREDSPOVGGGETYTrGLQFAPSQRV 
GEEEIIiIYINDHEDKEsTEEAFCVKVlYQ 


7134    2115    1111


GGEGFSYPPHVGLSU5TPIjDPHYVLI*EVHYDNPTYEEGLIDNSG 
LRLFYTMDIRKYDAGVI EAGLWVSI>FHTI PPGMPEFQSEGHCTI* 
ECLEEALEAEKPSGIHVPAVLLHAHLAGRGIRLRHFRKGKEMKX* 

laydddfdfnfoefqylkeeqtilpgdi^litecryntkdraemt 
wgglstrsemclsyllyyprinltrcasipdimeqlqfigvkei 
yrpvttwpfiikspkqyknlsfmdamnkfkwtkkeglsfnklvi, 

SIjPVNVRCSKTDNAEWSIQGMTALPPDIERPYKAEPXiVCGTSSS 
SSLHRPFS INLLVCLUiiLS Cri*STKSL 


7135    2    2072


FVPRVT PRSIiSLQGP KGESVGS XTQPLPSSYLI FRAASESDGRC 

wlidalblaiircss llrlgtckpgrdgepgts pdas psslcglpa 
satvhpdqdlfplngsslendafsdkserenpeesdtetqdhsh 
ktesgsdqsetpgapvrrgttyveqvqee1jgelgea5qvetvse " 
enkstimwtixkqlrpgmdlsrwlptpvleprspijnkiisdyyyh 
adiilsraaveedaysrmkbvxirwylsgfykkpkgrkkpynpxlg 
etfrccwfhpqtdsrtfyiaeqvshhppvsafhvsnrkdgfcis 
gs itaksrfygnslsalildgkatlttfi*nraedytltmpyahckg 
hiygtmtlielggkvt recaknnfqaqlefklkpffggsts inqi 
sgkitsgeevlaslsghwdrdvfikeegsgssalfwtpsgevrr 

QRLRQHTVPLEEQTELESERLWOHVTRAISKGDQHRATQEKPAIi 
EEAQRQRARERQESLMPWKPQLFHLDPITQEWHYRYEDHSPMDP 
LKDIAQFEQDGILRTLQQEAVARQTTPLGSPGPRHERSGPDQRli 
RICA^DQPSGHoQAT£o5GSTPE3CPEItbDEbWlXTDr vPGuKl>FC 
PRCRKEARRDQALHEAIIiS IREAQQEIiHRHUSAMIiSSTARAAQA 
PTPGLLOS PRSWFLLCVFLACQLFINHIIiK 


7136    2    418


dfvpsfrrpsgntsqtvv?llraatlekevaglrekimhl»ddmlk 
sqqrkvrqmieqlqnskaviqskdatiqelkekiaylbaenlem 
kdrmehliekqishgwfstqaraktenpgsiriskppspkpmpv 

IRWET 


7137    2    466


wasgmstvpggsrhslgiqvrggwgvtggeeesltvpvadtwqa 
gsfkvatqernpqraqmrlirrqkkgwpfiigdfltelqrlrjsai 
pddldgntnkrskevrvlqemqi.lqvaamkyrrirplekfvp'ypt 
rmeqlsdkesyklscqlepenp 


7138    2




WASGMSTVPGGSRHSLGIQVRGGWGVTGGEEESLTVPVADTWQA 
GSFKVATQERNPQRAQMRLRRQKKGWPFLGDFLTEliQRLDSAI 
PDDLIXSNTNKRSKEVRVUJEMQXiLQVAAMNYRLRPIjEKFVTYFT 
RMEQLSDKESYKLSCQLEPENP 


7139    1    357


SLRNSARGLKMAASAARGAAAIiRRSINQPVAFVRRlPWTAASSQ 
I.KEHFAQFGHVRRC;rLPFPKETGrHRGLGWVQFSSEEGLRNAI^ 
QEMHI IDGVKVQVHTRRPKLPQTSDDEKKDF 


7140    1401    1957


RASSJUQVLKAWGGLI PSSFQQQHTGQYALEELFDLKVYDCFCS F 
NMNVSLEKQLRPSQPVrPRGKCRKTPGWEEARPKAQDLRGDliGKT








WTPKGQDPPLMFS EDYQKSLLiEQVHLGUDQKLRKyVVGElil WNF 
ADFMTNQCX5 


7141    124    1073


LDSRSCWIiDMEDLEEDVRFIVDETLDFGGLSPSDSREEEDlTVL 
VTPE KPLRRGLSHRSD&NAVAPAPQGVRLSLGPIjSPEKLEE I LD 
EANRLAAQLEQCALQDRESAGEGLGPRRVKPSPRRETFVLKDSP 
VRDLIjPTVNSLTRSTPS /LKQPDASTPE* * *EGVSQGSPGYIWK 
EALQHEEGVTHLQSVPCIQKPS I FSS\SRSTPPVRGRAGPSGRA 
AAS EETRAAKLRGAAAKS S CQLP I PSAI PRPASRMPLTS RS VPP 
GRGAIiPPDSLSTRKGLPRPSTAGHRVRESGHKVPVSQRLNLPVM 
GATRSNLQPP 


7142    658    839


L I FLMI*HMEIiKMLS S VTLHIRAFIiY WI CLKPTSCL IFQNVXjNLL 
KK*SRAVGWWMCRT/YSSDr,QVGVIKPWIiLLGSQDAAHDLDT 
LKKNKVTHII*NVAYGVENAFI»SDFTYKS I S I LDLPETNILS YFP 
ECFE F I E EAKRKIX5 Vvl* VHCNA 


7143    3    773


SIiEMSSDGEPLSRMDSEDSISSTIMDVDSTISSGRSTPAMMNGQ 
GSTTSSS KNI AYNCC WDQCQACFNSS PDLADHIRS IHVDGQRGG 
VFVCLWKGCKVYNTPSTSQSWIiQRHMLTHSGDKPFKCWGGCNA 
S FASQGGIARHVPTHFSQQNSS KVSSQPKAKEESPSKAGMNKRR 
KLKNKRRRSIiARPHDFFDAQTLDAIRHRAlCFNLSAHIESIiGKG 
HSWFHSTVSILLFFQIKYKTXtQKNISTIISKSLKI 


7144    1    988


FRVNMQDGGPSPAEHSKAEESAGMEARFLGLPDAAGSSGPTPAR 
RCPAPRPAGVSYVIRDEVEKYNRNGVNALQLDPAIWRLFTAGRD 
SIIRIWSVNQHKQDPYIASMEHHTDWVNDIVLCCNGKTLISASS 
DTTVKVWNAHKGFCMSTLRTHKDYVKATAYAKDKELVASAGIJDR 
QI FLWDVNTLTALTASNNrVTTSSLSGHTKDS lYSLAMNQLGTX I 
VSGSTEKVLRVWDPRTCAKLMKliKGHTDNVKALt,L^mDGTO 
GSSDGTIRLWSLGQQRCIATYRVHDEGVWALQVNDAFTHVYSGG 
RDRKI YCTDLRNPD I RVIil CE 



WHAT IS CLAIMED IS:

1. An isolated polynucleotide comprising a nucleotide sequence selected from the
group consisting of SEQ ID NO: 1-1786 and 3573-5358, a mature protein coding portion
of SEQ ID NO: 1-1786 and 3573-5358, an active domain of SEQ ID NO: 1-1786 and
3573-5358, and complementary sequences thereof.

2. An isolated polynucleotide encoding a polypeptide with biological activity, 
wherein said polynucleotide hybridizes to the polynucleotide of claim 1 under stringent 
hybridization conditions. 

3. An isolated polynucleotide encoding a polypeptide with biological activity, 
wherein said polynucleotide has greater than about 90% sequence identity with the 
polynucleotide of claim 1.

4. The polynucleotide of claim 1 wherein said polynucleotide is DNA. 

5. An isolated polynucleotide of claim 1 wherein said polynucleotide comprises the 
complementary sequences. 

6. A vector comprising the polynucleotide of claim 1.

7. An expression vector comprising the polynucleotide of claim 1.

8. A host cell genetically engineered to comprise the polynucleotide of claim 1.

9. A host cell genetically engineered to comprise the polynucleotide of claim 1 
operatively associated with a regulatory sequence that modulates expression of the 
polynucleotide in the host cell. 

10. An isolated polypeptide, wherein the polypeptide is selected from the group
consisting of:

(a) a polypeptide encoded by any one of the polynucleotides of claim 1; and

(b) a polypeptide encoded by a polynucleotide hybridizing under stringent
conditions with any one of SEQ ID NO: 1-1786 and 3573-5358.

11. A composition comprising the polypeptide of claim 10 and a carrier.

12. An antibody directed against the polypeptide of claim 10. 

13. A method for detecting the polynucleotide of claim 1 in a sample, comprising:

a) contacting the sample with a compound that binds to and forms a 
complex with the polynucleotide of claim 1 for a period sufficient to form the complex; 
and 

b) detecting the complex, so that if a complex is detected, the 
polynucleotide of claim 1 is detected. 

14. A method for detecting the polynucleotide of claim 1 in a sample, comprising: 

a) contacting the sample under stringent hybridization conditions
with nucleic acid primers that anneal to the polynucleotide of claim 1 under such
conditions;

b) amplifying a product comprising at least a portion of the
polynucleotide of claim 1; and

c) detecting said product and thereby the polynucleotide of claim 1 in
the sample.

15. The method of claim 14, wherein the polynucleotide is an RNA molecule and the
method further comprises reverse transcribing an annealed RNA molecule into a cDNA
polynucleotide. 

16. A method for detecting the polypeptide of claim 10 in a sample, comprising:

a) contacting the sample with a compound that binds to and forms a
complex with the polypeptide under conditions and for a period sufficient to form the
complex; and

b) detecting formation of the complex, so that if a complex formation
is detected, the polypeptide of claim 10 is detected.

17. A method for identifying a compound that binds to the polypeptide of claim 10, 
comprising: 

a) contacting the compound with the polypeptide of claim 10 under 
conditions sufficient to form a polypeptide/compound complex; and 

b) detecting the complex, so that if the polypeptide/compound 
complex is detected, a compound that binds to the polypeptide of claim 10 is identified. 

18. A method for identifying a compound that binds to the polypeptide of claim 10, 
comprising: 

a) contacting the compound with the polypeptide of claim 10, in a
cell, under conditions sufficient to form a polypeptide/compound complex, wherein the
complex drives expression of a reporter gene sequence in the cell; and

b) detecting the complex by detecting reporter gene sequence
expression, so that if the polypeptide/compound complex is detected, a compound that
binds to the polypeptide of claim 10 is identified.

19. A method of producing the polypeptide of claim 10, comprising, 

a) culturing a host cell comprising a polynucleotide sequence selected 
from the group consisting of a polynucleotide sequence of SEQ ID NO: 1-1786 and
3573-5358, a mature protein coding portion of SEQ ID NO: 1-1786 and 3573-5358, an active
domain of SEQ ID NO: 1-1786 and 3573-5358, complementary sequences thereof and a
polynucleotide sequence hybridizing under stringent conditions to SEQ ID NO: 1-1786
and 3573-5358, under conditions sufficient to express the polypeptide in said cell; and

b) isolating the polypeptide from the cell culture or cells of step (a). 

20. An isolated polypeptide comprising an amino acid sequence selected from the
group consisting of any one of the polypeptides SEQ ID NO: 1787-3572 and 5359-7144,
the mature protein portion thereof, or the active domain thereof. 

21. The polypeptide of claim 20 wherein the polypeptide is provided on a polypeptide
array. 

22. A collection of polynucleotides, wherein the collection comprising the sequence 
information of at least one of SEQ ID NO: 1-1786 and 3573-5358.

23. The collection of claim 22, wherein the collection is provided on a nucleic acid
array. 

24. The collection of claim 23, wherein the array detects full-matches to any one of 
the polynucleotides in the collection. 

25. The collection of claim 23, wherein the array detects mismatches to any one of 
the polynucleotides in the collection. 

26. The collection of claim 22, wherein the collection is provided in a
computer-readable format.

27. A method of treatment comprising administering to a mammalian subject in need 
thereof a therapeutic amount of a composition comprising a polypeptide of claim 10 or 20
and a pharmaceutically acceptable carrier.

28. A method of treatment comprising administering to a mammalian subject in need 
thereof a therapeutic amount of a composition comprising an antibody that specifically 
binds to a polypeptide of claim 10 or 20 and a pharmaceutically acceptable carrier.